linux-mm.kvack.org archive mirror
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>
Cc: Tony Luck <tony.luck@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Muchun Song <muchun.song@linux.dev>,
	Benjamin LaHaise <bcrl@kvack.org>, <jglisse@redhat.com>,
	<linux-aio@kvack.org>, <linux-fsdevel@vger.kernel.org>,
	Zi Yan <ziy@nvidia.com>, Jiaqi Yan <jiaqiyan@google.com>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v1 00/11] mm: migrate: support poison recover from migrate folio
Date: Thu, 28 Mar 2024 21:30:36 +0800
Message-ID: <1c8b52d2-485f-4972-aa46-0493b18186f9@huawei.com>
In-Reply-To: <20240321032747.87694-1-wangkefeng.wang@huawei.com>

Hi, there have been no further changes since rfcv2. Kindly ping: any
comments are welcome. Thanks, all.

On 2024/3/21 11:27, Kefeng Wang wrote:
> Folio migration is widely used in the kernel (memory compaction, memory
> hotplug, soft offline page, NUMA balancing, memory demotion/promotion,
> etc.), but if a poisoned source folio is accessed during migration, the
> kernel will panic.
> 
> There is a mechanism in the kernel to recover from uncorrectable memory
> errors, ARCH_HAS_COPY_MC (Machine Check Safe Memory Copy), which is already
> used in NVDIMM and core-mm paths (e.g., CoW, khugepaged, coredump, ksm
> copy); see the callers of copy_mc_to_{user,kernel} and
> copy_mc_{user_}highpage.
> 
> This series provides a recovery mechanism for the folio copy in the
> widely used folio migration path. Because folio migration is not
> guaranteed to succeed anyway, we can choose to make it tolerant of memory
> failures: add folio_mc_copy(), a #MC version of folio_copy(), so that
> when a poisoned source folio is accessed we return an error and fail the
> migration, avoiding a panic like the one shown below.
> 
>    CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
>    pc : copy_page+0x10/0xc0
>    lr : copy_highpage+0x38/0x50
>    ...
>    Call trace:
>     copy_page+0x10/0xc0
>     folio_copy+0x78/0x90
>     migrate_folio_extra+0x54/0xa0
>     move_to_new_folio+0xd8/0x1f0
>     migrate_folio_move+0xb8/0x300
>     migrate_pages_batch+0x528/0x788
>     migrate_pages_sync+0x8c/0x258
>     migrate_pages+0x440/0x528
>     soft_offline_in_use_page+0x2ec/0x3c0
>     soft_offline_page+0x238/0x310
>     soft_offline_page_store+0x6c/0xc0
>     dev_attr_store+0x20/0x40
>     sysfs_kf_write+0x4c/0x68
>     kernfs_fop_write_iter+0x130/0x1c8
>     new_sync_write+0xa4/0x138
>     vfs_write+0x238/0x2d8
>     ksys_write+0x74/0x110
> 
> v1:
> - No changes; resent and rebased on 6.9-rc1
> 
> rfcv2:
> - Separate __migrate_device_pages() cleanup from patch "remove
>    migrate_folio_extra()", suggested by Matthew
> - Split folio_migrate_mapping(), move refcount check/freeze out
>    of folio_migrate_mapping(), suggested by Matthew
> - Add RB tags
> 
> Kefeng Wang (11):
>    mm: migrate: simplify __buffer_migrate_folio()
>    mm: migrate_device: use more folio in __migrate_device_pages()
>    mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY
>    mm: migrate: remove migrate_folio_extra()
>    mm: remove MIGRATE_SYNC_NO_COPY mode
>    mm: migrate: split folio_migrate_mapping()
>    mm: add folio_mc_copy()
>    mm: migrate: support poisoned recover from migrate folio
>    fs: hugetlbfs: support poison recover from hugetlbfs_migrate_folio()
>    mm: migrate: remove folio_migrate_copy()
>    fs: aio: add explicit check for large folio in aio_migrate_folio()
> 
>   fs/aio.c                     |  15 ++--
>   fs/hugetlbfs/inode.c         |   5 +-
>   include/linux/migrate.h      |   3 -
>   include/linux/migrate_mode.h |   5 --
>   include/linux/mm.h           |   1 +
>   mm/balloon_compaction.c      |   8 --
>   mm/migrate.c                 | 157 +++++++++++++++++------------------
>   mm/migrate_device.c          |  28 +++----
>   mm/util.c                    |  20 +++++
>   mm/zsmalloc.c                |   8 --
>   10 files changed, 115 insertions(+), 135 deletions(-)
> 
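
For readers skimming the thread, the idea behind folio_mc_copy() in the
quoted cover letter can be modelled in userspace. What follows is only a
sketch under stated assumptions, not the code from the series: struct
model_folio, model_copy_mc_page() and model_folio_mc_copy() are
hypothetical names invented here, the "poisoned" flag stands in for a real
hardware memory error, and the choice of -EHWPOISON as the error code is an
assumption rather than something the cover letter states. The point it
illustrates is the one the cover letter makes: copy page by page with a
machine-check-safe primitive and return an error instead of panicking, so
that the caller can simply fail the migration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for libcs that lack it */
#endif

#define MODEL_PAGE_SIZE	4096
#define MODEL_MAX_PAGES	8

/* Hypothetical userspace stand-in for a folio: a run of pages plus
 * a per-page flag that simulates a hardware-poisoned page. */
struct model_folio {
	size_t nr_pages;
	unsigned char data[MODEL_MAX_PAGES][MODEL_PAGE_SIZE];
	bool poisoned[MODEL_MAX_PAGES];
};

/* Stand-in for a machine-check-safe page copy: returns nonzero when
 * the source page cannot be read, instead of crashing. */
static int model_copy_mc_page(struct model_folio *dst,
			      const struct model_folio *src, size_t i)
{
	if (src->poisoned[i])
		return -1;
	memcpy(dst->data[i], src->data[i], MODEL_PAGE_SIZE);
	return 0;
}

/* Model of the #MC-safe folio copy: stop at the first poisoned page
 * and report an error so the caller can fail the migration. */
static int model_folio_mc_copy(struct model_folio *dst,
			       const struct model_folio *src)
{
	assert(dst->nr_pages == src->nr_pages);

	for (size_t i = 0; i < src->nr_pages; i++) {
		if (model_copy_mc_page(dst, src, i))
			return -EHWPOISON;
	}
	return 0;
}
```

In this model a caller would check the return value of
model_folio_mc_copy() and, on a nonzero result, leave the source folio in
place and report the migration as failed, which is the behaviour the cover
letter describes in place of the panic in the call trace.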



