From: Miaohe Lin <linmiaohe@huawei.com>
To: David Hildenbrand <david@redhat.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
<akpm@linux-foundation.org>, <osalvador@suse.de>,
<nao.horiguchi@gmail.com>, <mhocko@suse.com>,
Wupeng Ma <mawupeng1@huawei.com>
Subject: Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
Date: Wed, 22 Jan 2025 15:38:56 +0800 [thread overview]
Message-ID: <eebe218a-f381-3a24-36c5-e1472b91a1ce@huawei.com> (raw)
In-Reply-To: <2fc40750-2ff7-4e73-ba52-c4d9caaa4f4f@redhat.com>
On 2025/1/21 15:58, David Hildenbrand wrote:
> On 21.01.25 04:20, Miaohe Lin wrote:
>> On 2025/1/20 16:46, David Hildenbrand wrote:
>>> On 20.01.25 08:49, David Hildenbrand wrote:
>>>>
>>>>>> if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>>> struct address_space *mapping;
>>>>>> @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>> if (!mapping) {
>>>>>> pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>>> folio_pfn(folio));
>>>>>> - return;
>>>>>> + return -EBUSY;
>>>>>> }
>>>>>> try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>> } else {
>>>>>> try_to_unmap(folio, ttu);
>>>>>> }
>>>>>> +
>>>>>> + return folio_mapped(folio) ? -EBUSY : 0;
>>>>>
>>>>> Do we really need this return value? It's unused in do_migrate_range().
>>>>
>>>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>>>> It cleans up hwpoison_user_mappings() quite nicely.
>>>
>>> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.
>>
>> Do you mean TTU_SYNC? It seems it's not set.
>
> With your patch it will be now, which is the right thing to do I think.
>
>>
>> There might be a race will hit the proposed pr_warn_ratelimit():
>>
>> /* Assume folio is isolated for reclaim, so memory_failure failed to handle it at first time. Then it's put back to LRU. */
>> do_migrate_range
>> folio_test_hwpoison
>> folio_mapped
>> <folio is isolated for reclaim again.>
>> unmap_poisoned_folio
>> <folio is put buck.>
>> pr_warn_ratelimit(folio_mapped)
>>
>> But I might be miss something. And even this race is possible, it should be really hard to hit.
>
> Does try_to_unmap() care about isolation? Skimming over the code, I don't think so. I assume once we take the folio lock, races with reclaim are impossible.
I think you're right. I missed folio lock in above race.
>
> In any case, the race is unexpected, so pr_warn_() would be helpful and not harmful.
>
> Memory offlining code will later simply skip all PageHWPoison() pages, independent of the refcount as it seems. Failing to unmap might not be handled correctly at all ... I think this might be problematic in other regard (e.g., GUP references), but failing to unmap is "obviously" bad I think :)
Agree with you.
Thanks.
.
next prev parent reply other threads:[~2025-01-22 7:39 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-16 6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
2025-01-16 6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
2025-01-17 3:57 ` kernel test robot
2025-01-17 4:16 ` mawupeng
2025-01-17 4:39 ` kernel test robot
2025-01-17 4:49 ` kernel test robot
2025-01-20 6:24 ` Miaohe Lin
2025-01-20 7:49 ` David Hildenbrand
2025-01-20 8:46 ` David Hildenbrand
2025-01-21 3:20 ` Miaohe Lin
2025-01-21 7:58 ` David Hildenbrand
2025-01-22 7:38 ` Miaohe Lin [this message]
2025-01-21 2:46 ` Miaohe Lin
2025-01-20 9:06 ` mawupeng
2025-01-20 7:55 ` David Hildenbrand
2025-01-16 6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
2025-01-20 9:25 ` David Hildenbrand
2025-01-16 6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
2025-01-20 6:32 ` Miaohe Lin
2025-01-21 2:17 ` mawupeng
2025-01-20 8:01 ` David Hildenbrand
2025-01-20 9:11 ` mawupeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eebe218a-f381-3a24-36c5-e1472b91a1ce@huawei.com \
--to=linmiaohe@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mawupeng1@huawei.com \
--cc=mhocko@suse.com \
--cc=nao.horiguchi@gmail.com \
--cc=osalvador@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox