* [PATCH v3 1/5] mm: add folio_mc_copy()
2024-05-28 13:45 [PATCH v3 0/5] mm: migrate: support poison recover from migrate folio Kefeng Wang
@ 2024-05-28 13:45 ` Kefeng Wang
2024-05-28 14:17 ` Lance Yang
2024-05-28 20:13 ` Jane Chu
2024-05-28 13:45 ` [PATCH v3 2/5] mm: migrate: split folio_migrate_mapping() Kefeng Wang
` (3 subsequent siblings)
4 siblings, 2 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-28 13:45 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador, Kefeng Wang
Add a #MC variant of folio_copy() which uses copy_mc_highpage() so that
a #MC raised during the folio copy can be handled. It will be used in
folio migration soon.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/mm.h | 1 +
mm/util.c | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 12bf20ebae5a..0c78d67f787d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1321,6 +1321,7 @@ void put_pages_list(struct list_head *pages);
void split_page(struct page *page, unsigned int order);
void folio_copy(struct folio *dst, struct folio *src);
+int folio_mc_copy(struct folio *dst, struct folio *src);
unsigned long nr_free_buffer_pages(void);
diff --git a/mm/util.c b/mm/util.c
index c9e519e6811f..9462dbf7ce02 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -828,6 +828,26 @@ void folio_copy(struct folio *dst, struct folio *src)
}
EXPORT_SYMBOL(folio_copy);
+int folio_mc_copy(struct folio *dst, struct folio *src)
+{
+ long nr = folio_nr_pages(src);
+ long i = 0;
+ int ret = 0;
+
+ for (;;) {
+ if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) {
+ ret = -EFAULT;
+ break;
+ }
+ if (++i == nr)
+ break;
+ cond_resched();
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL(folio_mc_copy);
+
int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
int sysctl_overcommit_ratio __read_mostly = 50;
unsigned long sysctl_overcommit_kbytes __read_mostly;
--
2.27.0
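For readers outside the kernel tree, the copy loop above can be sketched in plain userspace C. Note this is only an illustration: copy_mc_page() and the poisoned() callback are hypothetical stand-ins for copy_mc_highpage() and hardware poison detection, and cond_resched() has no userspace equivalent.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical stand-in for copy_mc_highpage(): copies one page and
 * returns nonzero if the (simulated) machine check fires. */
static int copy_mc_page(unsigned char *dst, const unsigned char *src,
			int (*poisoned)(long idx), long idx)
{
	if (poisoned && poisoned(idx))
		return -1;	/* consumed poison while copying */
	memcpy(dst, src, PAGE_SIZE);
	return 0;
}

/* Mirrors the folio_mc_copy() loop: copy page by page and stop with
 * -EFAULT at the first poisoned page. */
static int folio_mc_copy_sketch(unsigned char *dst, const unsigned char *src,
				long nr, int (*poisoned)(long idx))
{
	for (long i = 0; i < nr; i++) {
		if (copy_mc_page(dst + i * PAGE_SIZE, src + i * PAGE_SIZE,
				 poisoned, i))
			return -EFAULT;
	}
	return 0;
}

/* Simulated poison predicate: page 1 of the folio is bad. */
static int poison_page1(long idx)
{
	return idx == 1;
}
```

With a NULL predicate the copy succeeds; with poison_page1 it fails part way through, just as a real #MC would abort the folio copy.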
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: [PATCH v3 1/5] mm: add folio_mc_copy()
2024-05-28 13:45 ` [PATCH v3 1/5] mm: add folio_mc_copy() Kefeng Wang
@ 2024-05-28 14:17 ` Lance Yang
2024-05-29 2:16 ` Kefeng Wang
2024-05-28 20:13 ` Jane Chu
1 sibling, 1 reply; 18+ messages in thread
From: Lance Yang @ 2024-05-28 14:17 UTC (permalink / raw)
To: Kefeng Wang
Cc: akpm, linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi,
Matthew Wilcox, David Hildenbrand, Muchun Song, Benjamin LaHaise,
jglisse, Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola,
Alistair Popple, Jane Chu, Oscar Salvador
Hi Kefeng,
On Tue, May 28, 2024 at 9:45 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> Add a #MC variant of folio_copy() which uses copy_mc_highpage() so that
> a #MC raised during the folio copy can be handled. It will be used in
> folio migration soon.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> include/linux/mm.h | 1 +
> mm/util.c | 20 ++++++++++++++++++++
> 2 files changed, 21 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 12bf20ebae5a..0c78d67f787d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1321,6 +1321,7 @@ void put_pages_list(struct list_head *pages);
>
> void split_page(struct page *page, unsigned int order);
> void folio_copy(struct folio *dst, struct folio *src);
> +int folio_mc_copy(struct folio *dst, struct folio *src);
>
> unsigned long nr_free_buffer_pages(void);
>
> diff --git a/mm/util.c b/mm/util.c
> index c9e519e6811f..9462dbf7ce02 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -828,6 +828,26 @@ void folio_copy(struct folio *dst, struct folio *src)
> }
> EXPORT_SYMBOL(folio_copy);
>
> +int folio_mc_copy(struct folio *dst, struct folio *src)
> +{
> + long nr = folio_nr_pages(src);
> + long i = 0;
> + int ret = 0;
> +
> + for (;;) {
> + if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) {
> + ret = -EFAULT;
> + break;
> + }
> + if (++i == nr)
> + break;
> + cond_resched();
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(folio_mc_copy);
The code below might be clearer:
+int folio_mc_copy(struct folio *dst, struct folio *src)
+{
+ long nr = folio_nr_pages(src);
+ long i = 0;
+
+ for (;;) {
+ if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i)))
+ return -EFAULT;
+ if (++i == nr)
+ break;
+ cond_resched();
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(folio_mc_copy);
IMO, we can return the error code directly without needing to use the 'ret'.
Thanks,
Lance
> +
> int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
> int sysctl_overcommit_ratio __read_mostly = 50;
> unsigned long sysctl_overcommit_kbytes __read_mostly;
> --
> 2.27.0
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: [PATCH v3 1/5] mm: add folio_mc_copy()
2024-05-28 14:17 ` Lance Yang
@ 2024-05-29 2:16 ` Kefeng Wang
0 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-29 2:16 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi,
Matthew Wilcox, David Hildenbrand, Muchun Song, Benjamin LaHaise,
jglisse, Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola,
Alistair Popple, Jane Chu, Oscar Salvador
On 2024/5/28 22:17, Lance Yang wrote:
> Hi Kefeng,
>
> On Tue, May 28, 2024 at 9:45 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>> Add a #MC variant of folio_copy() which uses copy_mc_highpage() so that
>> a #MC raised during the folio copy can be handled. It will be used in
>> folio migration soon.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> include/linux/mm.h | 1 +
>> mm/util.c | 20 ++++++++++++++++++++
>> 2 files changed, 21 insertions(+)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 12bf20ebae5a..0c78d67f787d 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1321,6 +1321,7 @@ void put_pages_list(struct list_head *pages);
>>
>> void split_page(struct page *page, unsigned int order);
>> void folio_copy(struct folio *dst, struct folio *src);
>> +int folio_mc_copy(struct folio *dst, struct folio *src);
>>
>> unsigned long nr_free_buffer_pages(void);
>>
>> diff --git a/mm/util.c b/mm/util.c
>> index c9e519e6811f..9462dbf7ce02 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -828,6 +828,26 @@ void folio_copy(struct folio *dst, struct folio *src)
>> }
>> EXPORT_SYMBOL(folio_copy);
>>
>> +int folio_mc_copy(struct folio *dst, struct folio *src)
>> +{
>> + long nr = folio_nr_pages(src);
>> + long i = 0;
>> + int ret = 0;
>> +
>> + for (;;) {
>> + if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) {
>> + ret = -EFAULT;
>> + break;
>> + }
>> + if (++i == nr)
>> + break;
>> + cond_resched();
>> + }
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(folio_mc_copy);
>
> The code below might be clearer:
>
> +int folio_mc_copy(struct folio *dst, struct folio *src)
> +{
> + long nr = folio_nr_pages(src);
> + long i = 0;
> +
> + for (;;) {
> + if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i)))
> + return -EFAULT;
> + if (++i == nr)
> + break;
> + cond_resched();
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(folio_mc_copy);
>
> IMO, we can return the error code directly without needing to use the 'ret'.
Clearer, will update, thanks
>
> Thanks,
> Lance
>
>> +
>> int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
>> int sysctl_overcommit_ratio __read_mostly = 50;
>> unsigned long sysctl_overcommit_kbytes __read_mostly;
>> --
>> 2.27.0
>>
>>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/5] mm: add folio_mc_copy()
2024-05-28 13:45 ` [PATCH v3 1/5] mm: add folio_mc_copy() Kefeng Wang
2024-05-28 14:17 ` Lance Yang
@ 2024-05-28 20:13 ` Jane Chu
2024-05-29 2:16 ` Kefeng Wang
1 sibling, 1 reply; 18+ messages in thread
From: Jane Chu @ 2024-05-28 20:13 UTC (permalink / raw)
To: Kefeng Wang, akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Oscar Salvador
On 5/28/2024 6:45 AM, Kefeng Wang wrote:
>
> +int folio_mc_copy(struct folio *dst, struct folio *src)
> +{
> + long nr = folio_nr_pages(src);
> + long i = 0;
> + int ret = 0;
> +
> + for (;;) {
> + if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) {
> + ret = -EFAULT;
> + break;
> + }
Why not be more specific by returning -EHWPOISON ?
-jane
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: [PATCH v3 1/5] mm: add folio_mc_copy()
2024-05-28 20:13 ` Jane Chu
@ 2024-05-29 2:16 ` Kefeng Wang
0 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-29 2:16 UTC (permalink / raw)
To: Jane Chu, akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Oscar Salvador
On 2024/5/29 4:13, Jane Chu wrote:
> On 5/28/2024 6:45 AM, Kefeng Wang wrote:
>
>> +int folio_mc_copy(struct folio *dst, struct folio *src)
>> +{
>> + long nr = folio_nr_pages(src);
>> + long i = 0;
>> + int ret = 0;
>> +
>> + for (;;) {
>> + if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) {
>> + ret = -EFAULT;
>> + break;
>> + }
> Why not be more specific by returning -EHWPOISON ?
Yes, this is better, will update, thanks.
>
> -jane
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/5] mm: migrate: split folio_migrate_mapping()
2024-05-28 13:45 [PATCH v3 0/5] mm: migrate: support poison recover from migrate folio Kefeng Wang
2024-05-28 13:45 ` [PATCH v3 1/5] mm: add folio_mc_copy() Kefeng Wang
@ 2024-05-28 13:45 ` Kefeng Wang
2024-05-28 13:45 ` [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio Kefeng Wang
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-28 13:45 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador, Kefeng Wang
The folio refcount check for !mapping and the folio_ref_freeze() for
mapping are moved out of the original folio_migrate_mapping(), so there
is no turning back once the new __folio_migrate_mapping() is entered;
also update the comments from page to folio.
Note that folio_ref_freeze() is moved outside xas_lock_irq(). Since the
folio is already isolated and locked during migration, there should be
no functional change.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/migrate.c | 63 ++++++++++++++++++++++++++--------------------------
1 file changed, 32 insertions(+), 31 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index e04b451c4289..e930376c261a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -393,50 +393,36 @@ static int folio_expected_refs(struct address_space *mapping,
}
/*
- * Replace the page in the mapping.
+ * Replace the folio in the mapping.
*
* The number of remaining references must be:
- * 1 for anonymous pages without a mapping
- * 2 for pages with a mapping
- * 3 for pages with a mapping and PagePrivate/PagePrivate2 set.
+ * 1 for anonymous folios without a mapping
+ * 2 for folios with a mapping
+ * 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
*/
-int folio_migrate_mapping(struct address_space *mapping,
- struct folio *newfolio, struct folio *folio, int extra_count)
+static void __folio_migrate_mapping(struct address_space *mapping,
+ struct folio *newfolio, struct folio *folio, int expected_cnt)
{
XA_STATE(xas, &mapping->i_pages, folio_index(folio));
struct zone *oldzone, *newzone;
- int dirty;
- int expected_count = folio_expected_refs(mapping, folio) + extra_count;
long nr = folio_nr_pages(folio);
long entries, i;
+ int dirty;
if (!mapping) {
- /* Anonymous page without mapping */
- if (folio_ref_count(folio) != expected_count)
- return -EAGAIN;
-
- /* No turning back from here */
+ /* Anonymous folio without mapping */
newfolio->index = folio->index;
newfolio->mapping = folio->mapping;
if (folio_test_swapbacked(folio))
__folio_set_swapbacked(newfolio);
-
- return MIGRATEPAGE_SUCCESS;
+ return;
}
oldzone = folio_zone(folio);
newzone = folio_zone(newfolio);
xas_lock_irq(&xas);
- if (!folio_ref_freeze(folio, expected_count)) {
- xas_unlock_irq(&xas);
- return -EAGAIN;
- }
-
- /*
- * Now we know that no one else is looking at the folio:
- * no turning back from here.
- */
+ /* Now we know that no one else is looking at the folio */
newfolio->index = folio->index;
newfolio->mapping = folio->mapping;
folio_ref_add(newfolio, nr); /* add cache reference */
@@ -452,7 +438,7 @@ int folio_migrate_mapping(struct address_space *mapping,
entries = 1;
}
- /* Move dirty while page refs frozen and newpage not yet exposed */
+ /* Move dirty while folio refs frozen and newfolio not yet exposed */
dirty = folio_test_dirty(folio);
if (dirty) {
folio_clear_dirty(folio);
@@ -466,22 +452,22 @@ int folio_migrate_mapping(struct address_space *mapping,
}
/*
- * Drop cache reference from old page by unfreezing
- * to one less reference.
+ * Since the old folio's refcount is frozen, drop the cache reference
+ * from the old folio by unfreezing to one less reference.
* We know this isn't the last reference.
*/
- folio_ref_unfreeze(folio, expected_count - nr);
+ folio_ref_unfreeze(folio, expected_cnt - nr);
xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
/*
* If moved to a different zone then also account
- * the page for that zone. Other VM counters will be
+ * the folio for that zone. Other VM counters will be
* taken care of when we establish references to the
- * new page and drop references to the old page.
+ * new folio and drop references to the old folio.
*
- * Note that anonymous pages are accounted for
+ * Note that anonymous folios are accounted for
* via NR_FILE_PAGES and NR_ANON_MAPPED if they
* are mapped to swap space.
*/
@@ -518,7 +504,22 @@ int folio_migrate_mapping(struct address_space *mapping,
}
}
local_irq_enable();
+}
+
+int folio_migrate_mapping(struct address_space *mapping, struct folio *newfolio,
+ struct folio *folio, int extra_count)
+{
+ int expected_cnt = folio_expected_refs(mapping, folio) + extra_count;
+
+ if (!mapping) {
+ if (folio_ref_count(folio) != expected_cnt)
+ return -EAGAIN;
+ } else {
+ if (!folio_ref_freeze(folio, expected_cnt))
+ return -EAGAIN;
+ }
+ __folio_migrate_mapping(mapping, newfolio, folio, expected_cnt);
return MIGRATEPAGE_SUCCESS;
}
EXPORT_SYMBOL(folio_migrate_mapping);
--
2.27.0
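The refcount-freeze contract that this split relies on can be illustrated with a small userspace sketch. C11 stdatomic stands in for the kernel's folio_ref_freeze()/folio_ref_unfreeze(); the real implementation has additional page_ref bookkeeping that this deliberately ignores.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Freezing succeeds only when the count matches the expected number of
 * references; it drops the count to 0 so nobody can gain a new ref while
 * the migration core rewrites the mapping. */
static bool ref_freeze(atomic_int *ref, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(ref, &old, 0);
}

/* Unfreezing publishes the new reference count. */
static void ref_unfreeze(atomic_int *ref, int count)
{
	atomic_store(ref, count);
}
```

If an unexpected reference exists, the freeze fails and the caller returns -EAGAIN; once the freeze succeeds, there is "no turning back" exactly as the commit message says.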
^ permalink raw reply [flat|nested] 18+ messages in thread

* [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio
2024-05-28 13:45 [PATCH v3 0/5] mm: migrate: support poison recover from migrate folio Kefeng Wang
2024-05-28 13:45 ` [PATCH v3 1/5] mm: add folio_mc_copy() Kefeng Wang
2024-05-28 13:45 ` [PATCH v3 2/5] mm: migrate: split folio_migrate_mapping() Kefeng Wang
@ 2024-05-28 13:45 ` Kefeng Wang
2024-05-28 20:15 ` Jane Chu
2024-05-28 13:45 ` [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio() Kefeng Wang
2024-05-28 13:45 ` [PATCH v3 5/5] mm: migrate: remove folio_migrate_copy() Kefeng Wang
4 siblings, 1 reply; 18+ messages in thread
From: Kefeng Wang @ 2024-05-28 13:45 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador, Kefeng Wang
Folio migration is widely used in the kernel: memory compaction, memory
hotplug, soft offlining, NUMA balancing, memory demotion/promotion, etc.
But if a poisoned source folio is accessed while migrating, the kernel
will panic.
There is a mechanism in the kernel to recover from uncorrectable memory
errors, ARCH_HAS_COPY_MC, which is already used in other core-mm paths,
e.g. CoW, khugepaged, coredump and ksm copy; see the
copy_mc_to_{user,kernel} and copy_mc_{user_}highpage callers.
In order to recover from a poisoned folio copy during folio migration,
make folio migration tolerant of memory failures and return an error
instead; folio migration has no guarantee of success anyway, and this
avoids panics like the one shown below.
CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
pc : copy_page+0x10/0xc0
lr : copy_highpage+0x38/0x50
...
Call trace:
copy_page+0x10/0xc0
folio_copy+0x78/0x90
migrate_folio_extra+0x54/0xa0
move_to_new_folio+0xd8/0x1f0
migrate_folio_move+0xb8/0x300
migrate_pages_batch+0x528/0x788
migrate_pages_sync+0x8c/0x258
migrate_pages+0x440/0x528
soft_offline_in_use_page+0x2ec/0x3c0
soft_offline_page+0x238/0x310
soft_offline_page_store+0x6c/0xc0
dev_attr_store+0x20/0x40
sysfs_kf_write+0x4c/0x68
kernfs_fop_write_iter+0x130/0x1c8
new_sync_write+0xa4/0x138
vfs_write+0x238/0x2d8
ksys_write+0x74/0x110
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/migrate.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index e930376c261a..e8acc6c38574 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -663,16 +663,28 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
struct folio *src, void *src_private,
enum migrate_mode mode)
{
- int rc;
+ int expected_cnt = folio_expected_refs(mapping, src);
- rc = folio_migrate_mapping(mapping, dst, src, 0);
- if (rc != MIGRATEPAGE_SUCCESS)
- return rc;
+ if (!mapping) {
+ if (folio_ref_count(src) != expected_cnt)
+ return -EAGAIN;
+ } else {
+ if (!folio_ref_freeze(src, expected_cnt))
+ return -EAGAIN;
+ }
+
+ if (unlikely(folio_mc_copy(dst, src))) {
+ if (mapping)
+ folio_ref_unfreeze(src, expected_cnt);
+ return -EFAULT;
+ }
+
+ __folio_migrate_mapping(mapping, dst, src, expected_cnt);
if (src_private)
folio_attach_private(dst, folio_detach_private(src));
- folio_migrate_copy(dst, src);
+ folio_migrate_flags(dst, src);
return MIGRATEPAGE_SUCCESS;
}
--
2.27.0
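The reworked error flow in __migrate_folio() can be condensed into a userspace model: freeze the source refcount first, attempt the #MC-safe copy, and unfreeze on failure so the source folio is handed back intact. migrate_flow(), copy_ok() and copy_poison() are hypothetical; the real code also performs the mapping update and flag copy that this sketch only marks with a comment.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MIGRATEPAGE_SUCCESS 0

static int migrate_flow(atomic_int *src_ref, int expected, bool has_mapping,
			int (*mc_copy)(void))
{
	if (!has_mapping) {
		/* Anonymous folio: just check the refcount. */
		if (atomic_load(src_ref) != expected)
			return -EAGAIN;
	} else {
		/* Freeze the refcount before touching the mapping. */
		int old = expected;

		if (!atomic_compare_exchange_strong(src_ref, &old, 0))
			return -EAGAIN;
	}

	if (mc_copy()) {
		/* Copy hit poison: restore the refcount and bail out. */
		if (has_mapping)
			atomic_store(src_ref, expected);
		return -EFAULT;
	}

	/* __folio_migrate_mapping() + folio_migrate_flags() would follow. */
	return MIGRATEPAGE_SUCCESS;
}

/* Simulated copy outcomes. */
static int copy_ok(void)     { return 0; }
static int copy_poison(void) { return -1; }
```

The key property is visible in the failure path: after -EFAULT the source refcount is back at its expected value, so the caller can retry or give up cleanly.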
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio
2024-05-28 13:45 ` [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio Kefeng Wang
@ 2024-05-28 20:15 ` Jane Chu
2024-05-29 2:19 ` Kefeng Wang
0 siblings, 1 reply; 18+ messages in thread
From: Jane Chu @ 2024-05-28 20:15 UTC (permalink / raw)
To: Kefeng Wang, akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Oscar Salvador
> +
> + if (unlikely(folio_mc_copy(dst, src))) {
> + if (mapping)
> + folio_ref_unfreeze(src, expected_cnt);
> + return -EFAULT;
> + }
Why not return what folio_mc_copy() returns?
Similar comment for patch 4/5 as well.
-jane
> +
> + __folio_migrate_mapping(mapping, dst, src, expected_cnt);
>
> if (src_private)
> folio_attach_private(dst, folio_detach_private(src));
>
> - folio_migrate_copy(dst, src);
> + folio_migrate_flags(dst, src);
> return MIGRATEPAGE_SUCCESS;
> }
>
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio
2024-05-28 20:15 ` Jane Chu
@ 2024-05-29 2:19 ` Kefeng Wang
0 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-29 2:19 UTC (permalink / raw)
To: Jane Chu, akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Oscar Salvador
On 2024/5/29 4:15, Jane Chu wrote:
>> +
>> + if (unlikely(folio_mc_copy(dst, src))) {
>> + if (mapping)
>> + folio_ref_unfreeze(src, expected_cnt);
>> + return -EFAULT;
>> + }
>
> Why not return what folio_mc_copy() returns?
The old version returned the result of folio_mc_copy(), but I omitted
it in this version; no strong opinion, I could add it back, thanks.
>
> Similar comment for patch 4/5 as well.
>
> -jane
>
>> +
>> + __folio_migrate_mapping(mapping, dst, src, expected_cnt);
>> if (src_private)
>> folio_attach_private(dst, folio_detach_private(src));
>> - folio_migrate_copy(dst, src);
>> + folio_migrate_flags(dst, src);
>> return MIGRATEPAGE_SUCCESS;
>> }
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 13:45 [PATCH v3 0/5] mm: migrate: support poison recover from migrate folio Kefeng Wang
` (2 preceding siblings ...)
2024-05-28 13:45 ` [PATCH v3 3/5] mm: migrate: support poisoned recover from migrate folio Kefeng Wang
@ 2024-05-28 13:45 ` Kefeng Wang
2024-05-28 15:41 ` Luck, Tony
2024-05-28 13:45 ` [PATCH v3 5/5] mm: migrate: remove folio_migrate_copy() Kefeng Wang
4 siblings, 1 reply; 18+ messages in thread
From: Kefeng Wang @ 2024-05-28 13:45 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador, Kefeng Wang
Similar to __migrate_folio(), use folio_mc_copy() in HugeTLB folio
migration to avoid a panic when copying from a poisoned folio.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
fs/hugetlbfs/inode.c | 2 +-
mm/migrate.c | 13 ++++++++-----
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 6df794ed4066..1107e5aa8343 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1128,7 +1128,7 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
hugetlb_set_folio_subpool(src, NULL);
}
- folio_migrate_copy(dst, src);
+ folio_migrate_flags(dst, src);
return MIGRATEPAGE_SUCCESS;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index e8acc6c38574..78cfad677789 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -532,15 +532,18 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
struct folio *dst, struct folio *src)
{
XA_STATE(xas, &mapping->i_pages, folio_index(src));
- int expected_count;
+ int expected_count = folio_expected_refs(mapping, src);
- xas_lock_irq(&xas);
- expected_count = folio_expected_refs(mapping, src);
- if (!folio_ref_freeze(src, expected_count)) {
- xas_unlock_irq(&xas);
+ if (!folio_ref_freeze(src, expected_count))
return -EAGAIN;
+
+ if (unlikely(folio_mc_copy(dst, src))) {
+ folio_ref_unfreeze(src, expected_count);
+ return -EFAULT;
}
+ xas_lock_irq(&xas);
+
dst->index = src->index;
dst->mapping = src->mapping;
--
2.27.0
^ permalink raw reply [flat|nested] 18+ messages in thread

* RE: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 13:45 ` [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio() Kefeng Wang
@ 2024-05-28 15:41 ` Luck, Tony
2024-05-28 20:19 ` Jane Chu
2024-05-29 2:48 ` Kefeng Wang
0 siblings, 2 replies; 18+ messages in thread
From: Luck, Tony @ 2024-05-28 15:41 UTC (permalink / raw)
To: Kefeng Wang, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, chu, jane,
Oscar Salvador
+ if (unlikely(folio_mc_copy(dst, src))) {
+ folio_ref_unfreeze(src, expected_count);
+ return -EFAULT;
It doesn't look like any code takes action to avoid re-using the poisoned page.
So you survived, hurrah! But left the problem page for some other code to trip over.
-Tony
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 15:41 ` Luck, Tony
@ 2024-05-28 20:19 ` Jane Chu
2024-05-28 20:30 ` Luck, Tony
2024-05-29 2:48 ` Kefeng Wang
1 sibling, 1 reply; 18+ messages in thread
From: Jane Chu @ 2024-05-28 20:19 UTC (permalink / raw)
To: Luck, Tony, Kefeng Wang, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, Oscar Salvador
On 5/28/2024 8:41 AM, Luck, Tony wrote:
> + if (unlikely(folio_mc_copy(dst, src))) {
> + folio_ref_unfreeze(src, expected_count);
> + return -EFAULT;
>
> It doesn't look like any code takes action to avoid re-using the poisoned page.
>
> So you survived, hurrah! But left the problem page for some other code to trip over.
Tony, did you mean that memory_failure_queue() should be called? If
not, could you elaborate more?
thanks,
-jane
>
> -Tony
^ permalink raw reply [flat|nested] 18+ messages in thread* RE: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 20:19 ` Jane Chu
@ 2024-05-28 20:30 ` Luck, Tony
2024-05-28 22:02 ` Jane Chu
0 siblings, 1 reply; 18+ messages in thread
From: Luck, Tony @ 2024-05-28 20:30 UTC (permalink / raw)
To: chu, jane, Kefeng Wang, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, Oscar Salvador
>> + if (unlikely(folio_mc_copy(dst, src))) {
>> + folio_ref_unfreeze(src, expected_count);
>> + return -EFAULT;
>>
>> It doesn't look like any code takes action to avoid re-using the poisoned page.
>>
>> So you survived, hurrah! But left the problem page for some other code to trip over.
>
> Tony, did you mean that memory_failure_queue() should be called? If
> not, could you elaborate more?
Maybe memory_failure_queue() can help here. Though it would need to know
which pfn inside the folio hit the poison. So some more infrastructure around the
copy to make sure the pfn is saved.
-Tony
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 20:30 ` Luck, Tony
@ 2024-05-28 22:02 ` Jane Chu
2024-05-28 22:10 ` Luck, Tony
0 siblings, 1 reply; 18+ messages in thread
From: Jane Chu @ 2024-05-28 22:02 UTC (permalink / raw)
To: Luck, Tony, Kefeng Wang, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, Oscar Salvador
On 5/28/2024 1:30 PM, Luck, Tony wrote:
>>> + if (unlikely(folio_mc_copy(dst, src))) {
>>> + folio_ref_unfreeze(src, expected_count);
>>> + return -EFAULT;
>>>
>>> It doesn't look like any code takes action to avoid re-using the poisoned page.
>>>
>>> So you survived, hurrah! But left the problem page for some other code to trip over.
>> Tony, did you mean that memory_failure_queue() should be called? If
>> not, could you elaborate more?
> Maybe memory_failure_queue() can help here. Though it would need to know
> which pfn inside the folio hit the poison. So some more infrastructure around the
> copy to make sure the pfn is saved.
It looks like memory_failure_queue() is only invoked when the source
user page has a poison, not when the poison is in a source kernel page.
Any idea why?
thanks!
-jane
> -Tony
>
^ permalink raw reply [flat|nested] 18+ messages in thread* RE: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 22:02 ` Jane Chu
@ 2024-05-28 22:10 ` Luck, Tony
0 siblings, 0 replies; 18+ messages in thread
From: Luck, Tony @ 2024-05-28 22:10 UTC (permalink / raw)
To: chu, jane, Kefeng Wang, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, Oscar Salvador
> It looks like memory_failure_queue() is only invoked when source user
> page has a poison, not if the poison
>
> is in source kernel page. Any idea why?
If it is a user page, Linux can unmap it from that process. Send
signal(SIGBUS) to the process if it was actively consuming the
poison (as opposed to poison found by patrol scrubber or by
speculative access).
Linux doesn't (yet) have a way to workaround poison in a source
kernel page. I think the only place it will ever get that is for pages
in the page cache. Action to somehow unmap them from the file?
-Tony
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
2024-05-28 15:41 ` Luck, Tony
2024-05-28 20:19 ` Jane Chu
@ 2024-05-29 2:48 ` Kefeng Wang
1 sibling, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-29 2:48 UTC (permalink / raw)
To: Luck, Tony, akpm, linux-mm
Cc: Miaohe Lin, nao.horiguchi, Matthew Wilcox, David Hildenbrand,
Muchun Song, Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan,
Hugh Dickins, Vishal Moola, Alistair Popple, chu, jane,
Oscar Salvador
On 2024/5/28 23:41, Luck, Tony wrote:
> + if (unlikely(folio_mc_copy(dst, src))) {
> + folio_ref_unfreeze(src, expected_count);
> + return -EFAULT;
>
> It doesn't look like any code takes action to avoid re-using the poisoned page.
>
> So you survived, hurrah! But left the problem page for some other code to trip over.
Hi Tony, thanks for your review,
We tried to avoid calling memory_failure_queue() after
copy_mc_{user_}highpage(); I think memory_failure() should be called by
the arch code (e.g. MCE on x86)[1] to handle the poisoned page. In
current mainline the x86 MCE code doesn't do that, so yes, we need a
memory_failure_queue() for x86; but that is not true for the upcoming
arm64 support, where the poisoned page is handled by apei_claim_sea()
and a new memory_failure_queue() is unnecessary (no issue, thanks to
the TestSetPageHWPoison() check in memory_failure()).
It seems that khugepaged[3][4] should do the same thing; we could call
memory_failure_queue() in copy_mc_{user_}highpage() and remove it from
each caller, is that OK?
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 00341b56d291..6b0d6f3c8580 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -352,6 +352,9 @@ static inline int copy_mc_user_highpage(struct page *to, struct page *from,
kunmap_local(vto);
kunmap_local(vfrom);
+ if (ret)
+ memory_failure_queue(page_to_pfn(from), 0);
+
return ret;
}
@@ -368,6 +371,9 @@ static inline int copy_mc_highpage(struct page *to, struct page *from)
kunmap_local(vto);
kunmap_local(vfrom);
+ if (ret)
+ memory_failure_queue(page_to_pfn(from), 0);
+
return ret;
}
Thanks.
[1]
https://lore.kernel.org/linux-mm/20240204082627.3892816-3-tongtiangen@huawei.com/
[2]
https://lore.kernel.org/linux-mm/20240528085915.1955987-1-tongtiangen@huawei.com/
[3] 12904d953364 mm/khugepaged: recover from poisoned file-backed memory
[4] 6efc7afb5cc9 mm/hwpoison: introduce copy_mc_highpage
>
> -Tony
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 5/5] mm: migrate: remove folio_migrate_copy()
2024-05-28 13:45 [PATCH v3 0/5] mm: migrate: support poison recover from migrate folio Kefeng Wang
` (3 preceding siblings ...)
2024-05-28 13:45 ` [PATCH v3 4/5] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio() Kefeng Wang
@ 2024-05-28 13:45 ` Kefeng Wang
4 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2024-05-28 13:45 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Hugh Dickins, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador, Kefeng Wang
folio_migrate_copy() is just a wrapper around folio_copy() and
folio_migrate_flags(). It is simple, and only aio uses it for now, so
open-code it there and remove folio_migrate_copy().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
fs/aio.c | 3 ++-
include/linux/migrate.h | 1 -
mm/migrate.c | 7 -------
3 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 07ff8bbdcd2a..bcee11fcb08b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -455,7 +455,8 @@ static int aio_migrate_folio(struct address_space *mapping, struct folio *dst,
* events from being lost.
*/
spin_lock_irqsave(&ctx->completion_lock, flags);
- folio_migrate_copy(dst, src);
+ folio_copy(dst, src);
+ folio_migrate_flags(dst, src);
BUG_ON(ctx->ring_folios[idx] != src);
ctx->ring_folios[idx] = dst;
spin_unlock_irqrestore(&ctx->completion_lock, flags);
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 517f70b70620..f9d92482d117 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -76,7 +76,6 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
void migration_entry_wait_on_locked(swp_entry_t entry, spinlock_t *ptl)
__releases(ptl);
void folio_migrate_flags(struct folio *newfolio, struct folio *folio);
-void folio_migrate_copy(struct folio *newfolio, struct folio *folio);
int folio_migrate_mapping(struct address_space *mapping,
struct folio *newfolio, struct folio *folio, int extra_count);
diff --git a/mm/migrate.c b/mm/migrate.c
index 78cfad677789..6404239938be 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -651,13 +651,6 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
}
EXPORT_SYMBOL(folio_migrate_flags);
-void folio_migrate_copy(struct folio *newfolio, struct folio *folio)
-{
- folio_copy(newfolio, folio);
- folio_migrate_flags(newfolio, folio);
-}
-EXPORT_SYMBOL(folio_migrate_copy);
-
/************************************************************
* Migration functions
***********************************************************/
--
2.27.0
^ permalink raw reply [flat|nested] 18+ messages in thread