From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>
Cc: Tony Luck <tony.luck@intel.com>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Matthew Wilcox <willy@infradead.org>,
David Hildenbrand <david@redhat.com>,
Muchun Song <muchun.song@linux.dev>,
Benjamin LaHaise <bcrl@kvack.org>, <jglisse@redhat.com>,
<linux-aio@kvack.org>, <linux-fsdevel@vger.kernel.org>,
Zi Yan <ziy@nvidia.com>, Jiaqi Yan <jiaqiyan@google.com>,
Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v1 00/11] mm: migrate: support poison recover from migrate folio
Date: Thu, 28 Mar 2024 21:30:36 +0800
Message-ID: <1c8b52d2-485f-4972-aa46-0493b18186f9@huawei.com>
In-Reply-To: <20240321032747.87694-1-wangkefeng.wang@huawei.com>
Hi, there have been no further changes since rfcv2. Kindly ping: any comments?
Thanks all.
On 2024/3/21 11:27, Kefeng Wang wrote:
> Folio migration is widely used in the kernel (memory compaction, memory
> hotplug, soft page offlining, NUMA balancing, memory demotion/promotion,
> etc.), but if a poisoned source folio is accessed during migration, the
> kernel will panic.
>
> There is a mechanism in the kernel to recover from uncorrectable memory
> errors, ARCH_HAS_COPY_MC (Machine Check Safe Memory Copy), which is already
> used in NVDIMM and core-mm paths (e.g. CoW, khugepaged, coredump, KSM copy);
> see the callers of copy_mc_to_{user,kernel} and copy_mc_{user_}highpage.
>
> This series provides a recovery mechanism for the folio copy in the
> widely used folio migration path. Please note that folio migration is
> not guaranteed to succeed, so we can choose to make it tolerant of
> memory failures: add folio_mc_copy(), a #MC version of folio_copy(),
> so that when a poisoned source folio is accessed we return an error
> and fail the migration, which avoids panics like the one shown below.
>
> CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
> pc : copy_page+0x10/0xc0
> lr : copy_highpage+0x38/0x50
> ...
> Call trace:
> copy_page+0x10/0xc0
> folio_copy+0x78/0x90
> migrate_folio_extra+0x54/0xa0
> move_to_new_folio+0xd8/0x1f0
> migrate_folio_move+0xb8/0x300
> migrate_pages_batch+0x528/0x788
> migrate_pages_sync+0x8c/0x258
> migrate_pages+0x440/0x528
> soft_offline_in_use_page+0x2ec/0x3c0
> soft_offline_page+0x238/0x310
> soft_offline_page_store+0x6c/0xc0
> dev_attr_store+0x20/0x40
> sysfs_kf_write+0x4c/0x68
> kernfs_fop_write_iter+0x130/0x1c8
> new_sync_write+0xa4/0x138
> vfs_write+0x238/0x2d8
> ksys_write+0x74/0x110
>
> v1:
> - no change, resend and rebased on 6.9-rc1
>
> rfcv2:
> - Separate __migrate_device_pages() cleanup from patch "remove
> migrate_folio_extra()", suggested by Matthew
> - Split folio_migrate_mapping(), move refcount check/freeze out
> of folio_migrate_mapping(), suggested by Matthew
> - add RB
>
> Kefeng Wang (11):
> mm: migrate: simplify __buffer_migrate_folio()
> mm: migrate_device: use more folio in __migrate_device_pages()
> mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY
> mm: migrate: remove migrate_folio_extra()
> mm: remove MIGRATE_SYNC_NO_COPY mode
> mm: migrate: split folio_migrate_mapping()
> mm: add folio_mc_copy()
> mm: migrate: support poisoned recover from migrate folio
> fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
> mm: migrate: remove folio_migrate_copy()
> fs: aio: add explicit check for large folio in aio_migrate_folio()
>
> fs/aio.c | 15 ++--
> fs/hugetlbfs/inode.c | 5 +-
> include/linux/migrate.h | 3 -
> include/linux/migrate_mode.h | 5 --
> include/linux/mm.h | 1 +
> mm/balloon_compaction.c | 8 --
> mm/migrate.c | 157 +++++++++++++++++------------------
> mm/migrate_device.c | 28 +++----
> mm/util.c | 20 +++++
> mm/zsmalloc.c | 8 --
> 10 files changed, 115 insertions(+), 135 deletions(-)
>
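To make the idea behind folio_mc_copy() easier to discuss, here is a minimal
userspace sketch of the "copy page by page, stop and report an error on the
first poisoned page" behaviour described in the cover letter. It is a toy
model, not the code from the series: struct toy_folio, toy_copy_mc_page(),
toy_folio_mc_copy(), the tiny page size and the choice of -EHWPOISON as the
error code are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for platforms without it */
#endif

#define TOY_PAGE_SIZE 64	/* hypothetical, deliberately tiny */
#define TOY_MAX_PAGES 8

/* Hypothetical userspace stand-in for a folio; not the kernel structure. */
struct toy_folio {
	size_t nr_pages;
	unsigned char data[TOY_MAX_PAGES][TOY_PAGE_SIZE];
	bool poisoned[TOY_MAX_PAGES];	/* models an uncorrectable memory error */
};

/*
 * Models a machine-check-safe page copy: returns 0 on success and nonzero
 * if the source page is poisoned, instead of crashing the caller.
 */
static int toy_copy_mc_page(struct toy_folio *dst,
			    const struct toy_folio *src, size_t i)
{
	if (src->poisoned[i])
		return -1;
	memcpy(dst->data[i], src->data[i], TOY_PAGE_SIZE);
	return 0;
}

/*
 * Copy every page of src into dst. Stop at the first poisoned source page
 * and return an error so that the caller can fail the migration.
 */
static int toy_folio_mc_copy(struct toy_folio *dst,
			     const struct toy_folio *src)
{
	assert(dst->nr_pages == src->nr_pages);
	assert(src->nr_pages <= TOY_MAX_PAGES);

	for (size_t i = 0; i < src->nr_pages; i++) {
		if (toy_copy_mc_page(dst, src, i))
			return -EHWPOISON;
	}
	return 0;
}
```

In this model a caller would treat a nonzero return as "migration failed,
leave the source folio where it is", which is the tolerant behaviour the
series aims for in place of the panic in the trace above.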
Thread overview: 25+ messages
2024-03-21 3:27 Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 01/11] mm: migrate: simplify __buffer_migrate_folio() Kefeng Wang
2024-04-01 17:54 ` Vishal Moola
2024-04-16 9:28 ` Miaohe Lin
2024-03-21 3:27 ` [PATCH v1 02/11] mm: migrate_device: use more folio in __migrate_device_pages() Kefeng Wang
2024-04-01 18:22 ` Vishal Moola
2024-04-02 6:21 ` Kefeng Wang
2024-04-02 15:54 ` Vishal Moola
2024-04-03 1:23 ` Kefeng Wang
2024-04-16 12:13 ` Miaohe Lin
2024-03-21 3:27 ` [PATCH v1 03/11] mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 04/11] mm: migrate: remove migrate_folio_extra() Kefeng Wang
2024-04-16 12:40 ` Miaohe Lin
2024-04-17 1:43 ` Kefeng Wang
2024-04-18 2:32 ` Miaohe Lin
2024-03-21 3:27 ` [PATCH v1 05/11] mm: remove MIGRATE_SYNC_NO_COPY mode Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 06/11] mm: migrate: split folio_migrate_mapping() Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 07/11] mm: add folio_mc_copy() Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 08/11] mm: migrate: support poisoned recover from migrate folio Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 09/11] fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio() Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 10/11] mm: migrate: remove folio_migrate_copy() Kefeng Wang
2024-03-21 3:27 ` [PATCH v1 11/11] fs: aio: add explicit check for large folio in aio_migrate_folio() Kefeng Wang
2024-03-21 3:35 ` Matthew Wilcox
2024-03-21 5:40 ` Kefeng Wang
2024-03-28 13:30 ` Kefeng Wang [this message]