From: Miaohe Lin <linmiaohe@huawei.com>
To: Wupeng Ma <mawupeng1@huawei.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
<akpm@linux-foundation.org>, <david@redhat.com>,
<osalvador@suse.de>, <nao.horiguchi@gmail.com>, <mhocko@suse.com>
Subject: Re: [PATCH v3 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
Date: Wed, 19 Feb 2025 10:50:59 +0800
Message-ID: <55e4ad74-752b-65c6-5ceb-b3a7fd7959a1@huawei.com>
In-Reply-To: <20250217014329.3610326-2-mawupeng1@huawei.com>
On 2025/2/17 9:43, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>
>
> Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
> TTU_HWPOISON") introduced TTU_HWPOISON to replace TTU_IGNORE_HWPOISON
> in order to stop sending a SIGBUS signal when an error page is accessed
> after a memory error on a clean folio. However, during page migration, an
> anon folio must be unmapped with TTU_HWPOISON set during unmap_*(). For
> pagecache folios we need a policy like the one in hwpoison_user_mappings()
> to decide whether to set this flag. So move this policy from
> hwpoison_user_mappings() to unmap_poisoned_folio() to handle this warning
> properly.
>
> A warning is produced when unmapping a poisoned folio, with the following log:
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
> Modules linked in:
> CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G W 6.13.0-rc1-00018-gacdb4bbda7ab #42
> Tainted: [W]=WARN
> Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : try_to_unmap_one+0x8fc/0xd3c
> lr : try_to_unmap_one+0x3dc/0xd3c
> Call trace:
> try_to_unmap_one+0x8fc/0xd3c (P)
> try_to_unmap_one+0x3dc/0xd3c (L)
> rmap_walk_anon+0xdc/0x1f8
> rmap_walk+0x3c/0x58
> try_to_unmap+0x88/0x90
> unmap_poisoned_folio+0x30/0xa8
> do_migrate_range+0x4a0/0x568
> offline_pages+0x5a4/0x670
> memory_block_action+0x17c/0x374
> memory_subsys_offline+0x3c/0x78
> device_offline+0xa4/0xd0
> state_store+0x8c/0xf0
> dev_attr_store+0x18/0x2c
> sysfs_kf_write+0x44/0x54
> kernfs_fop_write_iter+0x118/0x1a8
> vfs_write+0x3a8/0x4bc
> ksys_write+0x6c/0xf8
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x44/0x100
> el0_svc_common.constprop.0+0x40/0xe0
> do_el0_svc+0x1c/0x28
> el0_svc+0x30/0xd0
> el0t_64_sync_handler+0xc8/0xcc
> el0t_64_sync+0x198/0x19c
> ---[ end trace 0000000000000000 ]---
>
> Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON")
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
> Acked-by: David Hildenbrand <david@redhat.com>
Thanks. LGTM. One nit below.
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
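For reference, the flag policy that the commit message describes moving into unmap_poisoned_folio() can be modelled in user space roughly as below. This is only a sketch under stated assumptions: the MODEL_* flag values, struct model_folio and model_pick_ttu() are hypothetical names that mirror the checks in the hunk further down, and the pte_dirty field stands in for the return value of folio_mkclean(). It is not the kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for enum ttu_flags bits; the values are illustrative only. */
enum model_ttu_flags {
	MODEL_TTU_SYNC         = 1u << 0,
	MODEL_TTU_HWPOISON     = 1u << 1,
	MODEL_TTU_IGNORE_MLOCK = 1u << 2,
};

/* Hypothetical snapshot of the folio state the policy looks at. */
struct model_folio {
	bool in_swapcache;   /* models folio_test_swapcache() */
	bool dirty;          /* models folio_test_dirty() */
	bool has_mapping;    /* models folio_mapping() != NULL */
	bool can_writeback;  /* models mapping_can_writeback() */
	bool pte_dirty;      /* models what folio_mkclean() would report */
};

/* Start with TTU_HWPOISON set and clear it only for the two "drop quietly" cases. */
static unsigned int model_pick_ttu(struct model_folio *f, bool must_kill)
{
	unsigned int ttu = MODEL_TTU_IGNORE_MLOCK | MODEL_TTU_SYNC |
			   MODEL_TTU_HWPOISON;

	/* Poisoned folio kept in the swap cache: unmap without hwpoison entries. */
	if (f->in_swapcache)
		ttu &= ~MODEL_TTU_HWPOISON;

	/* Clean pagecache folio backed by writeback-capable storage. */
	if (!must_kill && !f->dirty && f->has_mapping && f->can_writeback) {
		if (f->pte_dirty)
			f->dirty = true;              /* propagate the PTE dirty bit */
		else
			ttu &= ~MODEL_TTU_HWPOISON;   /* truly clean: just drop it */
	}

	return ttu;
}
```

In this model an anon folio (has_mapping false, not in the swap cache) always keeps MODEL_TTU_HWPOISON, which matches the requirement stated in the commit message for the migration path.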
> ---
> mm/internal.h | 5 ++--
> mm/memory-failure.c | 61 +++++++++++++++++++++++----------------------
> mm/memory_hotplug.c | 3 ++-
> 3 files changed, 36 insertions(+), 33 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 9826f7dce607..c9186ca8d7c2 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1102,7 +1102,7 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask)
> * mm/memory-failure.c
> */
> #ifdef CONFIG_MEMORY_FAILURE
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu);
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
> void shake_folio(struct folio *folio);
> extern int hwpoison_filter(struct page *p);
>
> @@ -1125,8 +1125,9 @@ unsigned long page_mapped_in_vma(const struct page *page,
> struct vm_area_struct *vma);
>
> #else
> -static inline void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
> {
> + return -EBUSY;
> }
> #endif
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index a7b8ccd29b6f..b5212b6e330a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1556,8 +1556,34 @@ static int get_hwpoison_page(struct page *p, unsigned long flags)
> return ret;
> }
>
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
> {
> + enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
> + struct address_space *mapping;
> +
> + if (folio_test_swapcache(folio)) {
> + pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
> + ttu &= ~TTU_HWPOISON;
> + }
> +
> + /*
> + * Propagate the dirty bit from PTEs to struct page first, because we
> + * need this to decide if we should kill or just drop the page.
> + * XXX: the dirty test could be racy: set_page_dirty() may not always
> + * be called inside page lock (it's recommended but not enforced).
> + */
> + mapping = folio_mapping(folio);
> + if (!must_kill && !folio_test_dirty(folio) && mapping &&
> + mapping_can_writeback(mapping)) {
> + if (folio_mkclean(folio)) {
> + folio_set_dirty(folio);
> + } else {
> + ttu &= ~TTU_HWPOISON;
> + pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
> + pfn);
> + }
> + }
> +
> if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
> struct address_space *mapping;
This inner declaration of mapping can be removed, as one is already defined at
the top of the function. But this is trivial.
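To illustrate the nit with a minimal user-space sketch (struct model_mapping and both function names are made up for this example, not kernel code): an inner declaration with the same name shadows the outer variable, so assignments inside the block never reach the outer one. That is harmless when the outer value is not needed afterwards, but it is redundant, and building with -Wshadow would flag it.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct address_space. */
struct model_mapping {
	int id;
};

/* The inner declaration shadows the outer one; the outer stays NULL. */
static int with_shadowing(struct model_mapping *m)
{
	struct model_mapping *mapping = NULL;      /* function-scope variable */

	{
		struct model_mapping *mapping = m; /* block-scope, shadows it */
		(void)mapping;
	}

	return mapping != NULL;                    /* outer variable unchanged */
}

/* With a single declaration, the block reuses the function-scope variable. */
static int without_shadowing(struct model_mapping *m)
{
	struct model_mapping *mapping = NULL;

	{
		mapping = m;                       /* assigns the outer variable */
	}

	return mapping != NULL;
}
```

Dropping the inner declaration, as in without_shadowing(), leaves one variable for the whole function.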
Thanks.