* [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
@ 2024-06-11 15:32 Andrew Bresticker
2024-06-11 15:33 ` David Hildenbrand
2024-06-11 18:03 ` Matthew Wilcox
0 siblings, 2 replies; 12+ messages in thread
From: Andrew Bresticker @ 2024-06-11 15:32 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand
Cc: linux-mm, linux-kernel, Andrew Bresticker
The requirement that the head page be passed to do_set_pmd() was added
in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
finish_fault() and filemap_map_pages() paths if the page to be inserted
is anything but the head page for an otherwise suitable vma and pmd-sized
page.
Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
---
mm/memory.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index 0f47a533014e..a1fce5ddacb3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
return ret;
- if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
+ if (folio_order(folio) != HPAGE_PMD_ORDER)
return ret;
+ page = &folio->page;
/*
* Just backoff if any subpage of a THP is corrupted otherwise
--
2.34.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 15:32 [PATCH v2] mm/memory: Don't require head page for do_set_pmd() Andrew Bresticker
@ 2024-06-11 15:33 ` David Hildenbrand
2024-06-11 18:06 ` Andrew Morton
2024-06-11 18:03 ` Matthew Wilcox
1 sibling, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2024-06-11 15:33 UTC (permalink / raw)
To: Andrew Bresticker, Andrew Morton; +Cc: linux-mm, linux-kernel
On 11.06.24 17:32, Andrew Bresticker wrote:
> The requirement that the head page be passed to do_set_pmd() was added
> in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> finish_fault() and filemap_map_pages() paths if the page to be inserted
> is anything but the head page for an otherwise suitable vma and pmd-sized
> page.
>
> Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> ---
> mm/memory.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 0f47a533014e..a1fce5ddacb3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> return ret;
>
> - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> + if (folio_order(folio) != HPAGE_PMD_ORDER)
> return ret;
> + page = &folio->page;
>
> /*
> * Just backoff if any subpage of a THP is corrupted otherwise
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 15:32 [PATCH v2] mm/memory: Don't require head page for do_set_pmd() Andrew Bresticker
2024-06-11 15:33 ` David Hildenbrand
@ 2024-06-11 18:03 ` Matthew Wilcox
2024-06-11 18:21 ` Andrew Morton
1 sibling, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2024-06-11 18:03 UTC (permalink / raw)
To: Andrew Bresticker
Cc: Andrew Morton, David Hildenbrand, linux-mm, linux-kernel
On Tue, Jun 11, 2024 at 08:32:16AM -0700, Andrew Bresticker wrote:
> - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> + if (folio_order(folio) != HPAGE_PMD_ORDER)
> return ret;
> + page = &folio->page;
This works today, but in about six months time it's going to be a pain.
+ page = folio_page(folio, 0);
is the one which works today and in the future.
Of course, as I'm writing this now, I'm thinking we could make this work
for folios which are larger than PMD size, which we currently prohibit
(at least in part because this function doesn't work with folios larger
than PMD size):
pgoff_t idx = ((haddr - vma->vm_start) / PAGE_SIZE) + vma->vm_pgoff;
page = folio_file_page(folio, idx);
(there might be a more succinct way to write that, and I might have
messed up the polarity on something).
Worth starting to fix these places that don't work with folios larger
than PMD size? I think it'd be worth supporting, e.g., 4MB folios.
Current bandwidth of a gen5 x8 link is 32GB/s, so we can read 8,000
4MB pages/second at a latency of 125us. My laptop only has a gen4 x2
SSD, so a 4MB folio would take 1ms to read which might be a little
excessive for normal use. Of course allocating an order 10 folio is
hard, so it's not going to happen terribly often anyway.
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 15:33 ` David Hildenbrand
@ 2024-06-11 18:06 ` Andrew Morton
2024-06-11 18:22 ` Matthew Wilcox
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-06-11 18:06 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Andrew Bresticker, linux-mm, linux-kernel
On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
> On 11.06.24 17:32, Andrew Bresticker wrote:
> > The requirement that the head page be passed to do_set_pmd() was added
> > in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> > folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> > finish_fault() and filemap_map_pages() paths if the page to be inserted
> > is anything but the head page for an otherwise suitable vma and pmd-sized
> > page.
> >
> > Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> > Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> > ---
> > mm/memory.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0f47a533014e..a1fce5ddacb3 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> > if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> > return ret;
> >
> > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > return ret;
> > + page = &folio->page;
> >
> > /*
> > * Just backoff if any subpage of a THP is corrupted otherwise
>
> Acked-by: David Hildenbrand <david@redhat.com>
You know what I'm going to ask ;) I'm assuming that the runtime effects
are "small performance optimization" and that "should we backport the
fix" is "no".
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 18:03 ` Matthew Wilcox
@ 2024-06-11 18:21 ` Andrew Morton
2024-06-11 18:38 ` Matthew Wilcox
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-06-11 18:21 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Bresticker, David Hildenbrand, linux-mm, linux-kernel
On Tue, 11 Jun 2024 19:03:29 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Jun 11, 2024 at 08:32:16AM -0700, Andrew Bresticker wrote:
> > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > return ret;
> > + page = &folio->page;
>
> This works today, but in about six months time it's going to be a pain.
>
> + page = folio_page(folio, 0);
>
> is the one which works today and in the future.
I was wondering about that.
hp2:/usr/src/25> fgrep "&folio->page" mm/*.c | wc -l
84
hp2:/usr/src/25> fgrep "folio_page(" mm/*.c | wc -l
35
Should these all be converted? What's the general rule here?
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 18:06 ` Andrew Morton
@ 2024-06-11 18:22 ` Matthew Wilcox
2024-06-11 20:07 ` Andrew Morton
2024-08-19 8:16 ` Vincent Donnefort
0 siblings, 2 replies; 12+ messages in thread
From: Matthew Wilcox @ 2024-06-11 18:22 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Andrew Bresticker, linux-mm, linux-kernel
On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
> On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
>
> > On 11.06.24 17:32, Andrew Bresticker wrote:
> > > The requirement that the head page be passed to do_set_pmd() was added
> > > in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> > > folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> > > finish_fault() and filemap_map_pages() paths if the page to be inserted
> > > is anything but the head page for an otherwise suitable vma and pmd-sized
> > > page.
> > >
> > > Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> > > Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> > > ---
> > > mm/memory.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 0f47a533014e..a1fce5ddacb3 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> > > if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> > > return ret;
> > >
> > > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > > return ret;
> > > + page = &folio->page;
> > >
> > > /*
> > > * Just backoff if any subpage of a THP is corrupted otherwise
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
>
> You know what I'm going to ask ;) I'm assuming that the runtime effects
> are "small performance optimization" and that "should we backport the
> fix" is "no".
We're going to stop using PMDs to map large folios unless the fault is
within the first 4KiB of the PMD. No idea how many workloads that
affects, but it only needs to be backported as far as v6.8, so we
may as well backport it.
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 18:21 ` Andrew Morton
@ 2024-06-11 18:38 ` Matthew Wilcox
0 siblings, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2024-06-11 18:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Andrew Bresticker, David Hildenbrand, linux-mm, linux-kernel
On Tue, Jun 11, 2024 at 11:21:31AM -0700, Andrew Morton wrote:
> On Tue, 11 Jun 2024 19:03:29 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Tue, Jun 11, 2024 at 08:32:16AM -0700, Andrew Bresticker wrote:
> > > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > > return ret;
> > > + page = &folio->page;
> >
> > This works today, but in about six months time it's going to be a pain.
> >
> > + page = folio_page(folio, 0);
> >
> > is the one which works today and in the future.
>
> I was wondering about that.
>
> hp2:/usr/src/25> fgrep "&folio->page" mm/*.c | wc -l
> 84
> hp2:/usr/src/25> fgrep "folio_page(" mm/*.c | wc -l
> 35
>
> Should these all be converted? What's the general rule here?
The rule is ...
- If we haven't thought about it, use &folio->page to indicate that
somebody needs to think about it.
- If the code needs to be modified to split folio and page apart, use
&folio->page.
- If the code is part of compat code which is going to have to be
removed, use &folio->page (eg do_read_cache_page()).
- If you *have* thought about it, use folio_page() or folio_file_page(), but
don't just blindly pass 0 as the second argument. Think about which page
within the folio is expected by the function you're working on.
Often that is "the first one!", and so folio_page(folio, 0) is the
right answer. But that should be justified.
It might be that the right answer is "Oh, that function should take a folio".
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 18:22 ` Matthew Wilcox
@ 2024-06-11 20:07 ` Andrew Morton
2024-06-11 21:18 ` Hugh Dickins
2024-08-19 8:16 ` Vincent Donnefort
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-06-11 20:07 UTC (permalink / raw)
To: Matthew Wilcox
Cc: David Hildenbrand, Andrew Bresticker, linux-mm, linux-kernel
On Tue, 11 Jun 2024 19:22:03 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
> > On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
> >
> > > On 11.06.24 17:32, Andrew Bresticker wrote:
> > > > The requirement that the head page be passed to do_set_pmd() was added
> > > > in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> > > > folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> > > > finish_fault() and filemap_map_pages() paths if the page to be inserted
> > > > is anything but the head page for an otherwise suitable vma and pmd-sized
> > > > page.
> > > >
> > > > Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> > > > Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> > > > ---
> > > > mm/memory.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 0f47a533014e..a1fce5ddacb3 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> > > > if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> > > > return ret;
> > > >
> > > > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > > > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > > > return ret;
> > > > + page = &folio->page;
> > > >
> > > > /*
> > > > * Just backoff if any subpage of a THP is corrupted otherwise
> > >
> > > Acked-by: David Hildenbrand <david@redhat.com>
> >
> > You know what I'm going to ask ;) I'm assuming that the runtime effects
> > are "small performance optimization" and that "should we backport the
> > fix" is "no".
>
> We're going to stop using PMDs to map large folios unless the fault is
> within the first 4KiB of the PMD. No idea how many workloads that
> affects, but it only needs to be backported as far as v6.8, so we
> may as well backport it.
OK, thanks, I pasted the above text and added the cc:stable.
I didn't move it into the hotfixes queue - it's a non-trivial
behavioral change and extra test time seems prudent(?).
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 20:07 ` Andrew Morton
@ 2024-06-11 21:18 ` Hugh Dickins
2024-06-12 18:41 ` David Hildenbrand
0 siblings, 1 reply; 12+ messages in thread
From: Hugh Dickins @ 2024-06-11 21:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Matthew Wilcox, David Hildenbrand, Andrew Bresticker, linux-mm,
linux-kernel
On Tue, 11 Jun 2024, Andrew Morton wrote:
> On Tue, 11 Jun 2024 19:22:03 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> > On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
> > > On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
> > >
> > > > On 11.06.24 17:32, Andrew Bresticker wrote:
> > > > > The requirement that the head page be passed to do_set_pmd() was added
> > > > > in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> > > > > folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> > > > > finish_fault() and filemap_map_pages() paths if the page to be inserted
> > > > > is anything but the head page for an otherwise suitable vma and pmd-sized
> > > > > page.
> > > > >
> > > > > Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> > > > > Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> > > > > ---
> > > > > mm/memory.c | 3 ++-
> > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > > index 0f47a533014e..a1fce5ddacb3 100644
> > > > > --- a/mm/memory.c
> > > > > +++ b/mm/memory.c
> > > > > @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> > > > > if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> > > > > return ret;
> > > > >
> > > > > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > > > > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > > > > return ret;
> > > > > + page = &folio->page;
> > > > >
> > > > > /*
> > > > > * Just backoff if any subpage of a THP is corrupted otherwise
> > > >
> > > > Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
> > >
> > > You know what I'm going to ask ;) I'm assuming that the runtime effects
> > > are "small performance optimization" and that "should we backport the
> > > fix" is "no".
> >
> > We're going to stop using PMDs to map large folios unless the fault is
> > within the first 4KiB of the PMD. No idea how many workloads that
> > affects, but it only needs to be backported as far as v6.8, so we
> > may as well backport it.
>
> OK, thanks, I pasted the above text and added the cc:stable.
Yes please. My interest in this being that yesterday I discovered
the large drop in ShmemPmdMapped between v6.7 and v6.8, bisected,
and was testing overnight with a patch very much like this one of
Andrew's. I'd been hoping to send mine today, but now no need.
>
> I didn't move it into the hotfixes queue - it's a non-trivial
> behavioral change and extra test time seems prudent(?).
It is certainly worth some test soak time, and the bug might have
been masking other issues which may now emerge; but the fix is
just reverting to the old pre-v6.8 behaviour.
Thanks,
Hugh
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 21:18 ` Hugh Dickins
@ 2024-06-12 18:41 ` David Hildenbrand
0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2024-06-12 18:41 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton
Cc: Matthew Wilcox, Andrew Bresticker, linux-mm, linux-kernel
On 11.06.24 23:18, Hugh Dickins wrote:
> On Tue, 11 Jun 2024, Andrew Morton wrote:
>> On Tue, 11 Jun 2024 19:22:03 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>>> On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
>>>> On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>>> On 11.06.24 17:32, Andrew Bresticker wrote:
>>>>>> The requirement that the head page be passed to do_set_pmd() was added
>>>>>> in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
>>>>>> folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
>>>>>> finish_fault() and filemap_map_pages() paths if the page to be inserted
>>>>>> is anything but the head page for an otherwise suitable vma and pmd-sized
>>>>>> page.
>>>>>>
>>>>>> Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
>>>>>> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
>>>>>> ---
>>>>>> mm/memory.c | 3 ++-
>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>>>> index 0f47a533014e..a1fce5ddacb3 100644
>>>>>> --- a/mm/memory.c
>>>>>> +++ b/mm/memory.c
>>>>>> @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>>>>>> if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
>>>>>> return ret;
>>>>>>
>>>>>> - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
>>>>>> + if (folio_order(folio) != HPAGE_PMD_ORDER)
>>>>>> return ret;
>>>>>> + page = &folio->page;
>>>>>>
>>>>>> /*
>>>>>> * Just backoff if any subpage of a THP is corrupted otherwise
>>>>>
>>>>> Acked-by: David Hildenbrand <david@redhat.com>
>
> Acked-by: Hugh Dickins <hughd@google.com>
>
>>>>
>>>> You know what I'm going to ask ;) I'm assuming that the runtime effects
>>>> are "small performance optimization" and that "should we backport the
>>>> fix" is "no".
>>>
>>> We're going to stop using PMDs to map large folios unless the fault is
>>> within the first 4KiB of the PMD. No idea how many workloads that
>>> affects, but it only needs to be backported as far as v6.8, so we
>>> may as well backport it.
>>
>> OK, thanks, I pasted the above text and added the cc:stable.
>
> Yes please. My interest in this being that yesterday I discovered
> the large drop in ShmemPmdMapped between v6.7 and v6.8, bisected,
> and was testing overnight with a patch very much like this one of
> Andrew's. I'd been hoping to send mine today, but now no need.
>
>>
>> I didn't move it into the hotfixes queue - it's a non-trivial
>> behavioral change and extra test time seems prudent(?).
>
> It is certainly worth some test soak time, and the bug might have
> been masking other issues which may now emerge; but the fix is
> just reverting to the old pre-v6.8 behaviour.
Right, I don't expect surprises, really. I'm rather surprised that
nobody noticed and that the usual 0-day benchmarks don't trigger that case.
--
Cheers,
David / dhildenb
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-06-11 18:22 ` Matthew Wilcox
2024-06-11 20:07 ` Andrew Morton
@ 2024-08-19 8:16 ` Vincent Donnefort
2024-08-20 20:33 ` Hugh Dickins
1 sibling, 1 reply; 12+ messages in thread
From: Vincent Donnefort @ 2024-08-19 8:16 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andrew Morton, David Hildenbrand, Andrew Bresticker, linux-mm,
linux-kernel, maz, seanjc
On Tue, Jun 11, 2024 at 07:22:03PM +0100, Matthew Wilcox wrote:
> On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
> > On Tue, 11 Jun 2024 17:33:17 +0200 David Hildenbrand <david@redhat.com> wrote:
> >
> > > On 11.06.24 17:32, Andrew Bresticker wrote:
> > > > The requirement that the head page be passed to do_set_pmd() was added
> > > > in commit ef37b2ea08ac ("mm/memory: page_add_file_rmap() ->
> > > > folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
> > > > finish_fault() and filemap_map_pages() paths if the page to be inserted
> > > > is anything but the head page for an otherwise suitable vma and pmd-sized
> > > > page.
> > > >
> > > > Fixes: ef37b2ea08ac ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
> > > > Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> > > > ---
> > > > mm/memory.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 0f47a533014e..a1fce5ddacb3 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -4614,8 +4614,9 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> > > > if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
> > > > return ret;
> > > >
> > > > - if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER)
> > > > + if (folio_order(folio) != HPAGE_PMD_ORDER)
> > > > return ret;
> > > > + page = &folio->page;
> > > >
> > > > /*
> > > > * Just backoff if any subpage of a THP is corrupted otherwise
> > >
> > > Acked-by: David Hildenbrand <david@redhat.com>
> >
> > You know what I'm going to ask ;) I'm assuming that the runtime effects
> > are "small performance optimization" and that "should we backport the
> > fix" is "no".
>
> We're going to stop using PMDs to map large folios unless the fault is
> within the first 4KiB of the PMD. No idea how many workloads that
> affects, but it only needs to be backported as far as v6.8, so we
> may as well backport it.
Hi, I am reviving this thread after noticing this comment attached
to the fix.
If you intend to install PTE level mappings for faults that happen outside of
the first 4KiB, I believe this will make THP support for KVM ineffective.
* Re: [PATCH v2] mm/memory: Don't require head page for do_set_pmd()
2024-08-19 8:16 ` Vincent Donnefort
@ 2024-08-20 20:33 ` Hugh Dickins
0 siblings, 0 replies; 12+ messages in thread
From: Hugh Dickins @ 2024-08-20 20:33 UTC (permalink / raw)
To: Vincent Donnefort
Cc: Matthew Wilcox, Andrew Morton, David Hildenbrand,
Andrew Bresticker, linux-mm, linux-kernel, maz, seanjc
On Mon, 19 Aug 2024, Vincent Donnefort wrote:
> On Tue, Jun 11, 2024 at 07:22:03PM +0100, Matthew Wilcox wrote:
> > On Tue, Jun 11, 2024 at 11:06:22AM -0700, Andrew Morton wrote:
...
> > > You know what I'm going to ask ;) I'm assuming that the runtime effects
> > > are "small performance optimization" and that "should we backport the
> > > fix" is "no".
> >
> > We're going to stop using PMDs to map large folios unless the fault is
> > within the first 4KiB of the PMD. No idea how many workloads that
> > affects, but it only needs to be backported as far as v6.8, so we
> > may as well backport it.
>
> Hi, I am reviving this thread after noticing this comment attached
> to the fix.
>
> If you intend to install PTE level mappings for faults that happen outside of
> the first 4KiB, I believe this will make THP support for KVM ineffective.
You can relax, it's okay: where Matthew wrote "We're going to stop...",
he was describing the runtime effects of the bug (now fixed) to Andrew,
not proposing to make a change to mess up THP support.
The fix was backported to v6.9.N, but was too late for v6.8.N EOL.
Hugh
end of thread, other threads:[~2024-08-20 20:34 UTC | newest]
Thread overview: 12+ messages
2024-06-11 15:32 [PATCH v2] mm/memory: Don't require head page for do_set_pmd() Andrew Bresticker
2024-06-11 15:33 ` David Hildenbrand
2024-06-11 18:06 ` Andrew Morton
2024-06-11 18:22 ` Matthew Wilcox
2024-06-11 20:07 ` Andrew Morton
2024-06-11 21:18 ` Hugh Dickins
2024-06-12 18:41 ` David Hildenbrand
2024-08-19 8:16 ` Vincent Donnefort
2024-08-20 20:33 ` Hugh Dickins
2024-06-11 18:03 ` Matthew Wilcox
2024-06-11 18:21 ` Andrew Morton
2024-06-11 18:38 ` Matthew Wilcox