From: Miaohe Lin <linmiaohe@huawei.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
David Hildenbrand <david@redhat.com>,
Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
Naoya Horiguchi <nao.horiguchi@gmail.com>, <linux-mm@kvack.org>,
<dan.carpenter@linaro.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v3 1/5] mm: memory_hotplug: remove head variable in do_migrate_range()
Date: Sun, 29 Sep 2024 10:19:05 +0800
Message-ID: <986bca05-7fd9-49ac-9129-934f31c28af6@huawei.com>
In-Reply-To: <841eb150-fac6-461e-808f-e6ae607c7d81@huawei.com>
On 2024/9/29 10:04, Kefeng Wang wrote:
>
>
> On 2024/9/29 9:16, Miaohe Lin wrote:
>> On 2024/9/28 16:39, David Hildenbrand wrote:
>>> On 28.09.24 10:34, David Hildenbrand wrote:
>>>> On 28.09.24 06:55, Matthew Wilcox wrote:
>>>>> On Tue, Aug 27, 2024 at 07:47:24PM +0800, Kefeng Wang wrote:
>>>>>> Directly use a folio for HugeTLB and THP when calculating the next pfn,
>>>>>> then remove the unused head variable.
>>>>>
>>>>> I just noticed this got merged. You're going to hit BUG_ON with it.
>>>>>
>>>>>> -		if (PageHuge(page)) {
>>>>>> -			pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>>>>> -			isolate_hugetlb(folio, &source);
>>>>>> -			continue;
>>>>>> -		} else if (PageTransHuge(page))
>>>>>> -			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>>>>>> +		/*
>>>>>> +		 * No reference or lock is held on the folio, so it might
>>>>>> +		 * be modified concurrently (e.g. split). As such,
>>>>>> +		 * folio_nr_pages() may read garbage. This is fine as the outer
>>>>>> +		 * loop will revisit the split folio later.
>>>>>> +		 */
>>>>>> +		if (folio_test_large(folio)) {
>>>>>
>>>>> But it's not fine. Look at the implementation of folio_test_large():
>>>>>
>>>>> static inline bool folio_test_large(const struct folio *folio)
>>>>> {
>>>>> 	return folio_test_head(folio);
>>>>> }
>>>>>
>>>>> That's going to be provided by:
>>>>>
>>>>> #define FOLIO_TEST_FLAG(name, page) \
>>>>> static __always_inline bool folio_test_##name(const struct folio *folio) \
>>>>> { return test_bit(PG_##name, const_folio_flags(folio, page)); }
>>>>>
>>>>> and here's the BUG:
>>>>>
>>>>> static const unsigned long *const_folio_flags(const struct folio *folio,
>>>>> 		unsigned n)
>>>>> {
>>>>> 	const struct page *page = &folio->page;
>>>>>
>>>>> 	VM_BUG_ON_PGFLAGS(PageTail(page), page);
>>>>> 	VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
>>>>> 	return &page[n].flags;
>>>>> }
>>>>>
>>>>> (this page can be transformed from a head page to a tail page because,
>>>>> as the comment notes, we don't hold a reference.)
>>>>>
>>>>> Please back this out.
>>>>
>>>> Should we generalize the approach in dump_folio() to locally copy a
>>>> folio, so we can safely perform checks before deciding whether we want
>>>> to try grabbing a reference on the real folio (if it's still a folio :) )?
>>>>
>>>
>>> Oh, and I forgot: isn't the existing code already racy?
>>>
>>> PageTransHuge() -> VM_BUG_ON_PAGE(PageTail(page), page);
>
> Yes, in v1[1], I asked the same question about the existing code for PageTransHuge(page):
>
> "If the page is a tail page, we will BUG_ON(DEBUG_VM enabled) here,
> but it seems that we don't guarantee the page won't be a tail page."
>
>
> We could delay the calculation until after we have a reference, but the pfn traversal may slow down a little if we hit a tail pfn. Is that acceptable?
>
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1786,15 +1786,6 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		page = pfn_to_page(pfn);
>  		folio = page_folio(page);
>
> -		/*
> -		 * No reference or lock is held on the folio, so it might
> -		 * be modified concurrently (e.g. split). As such,
> -		 * folio_nr_pages() may read garbage. This is fine as the outer
> -		 * loop will revisit the split folio later.
> -		 */
> -		if (folio_test_large(folio))
> -			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
> -
>  		/*
>  		 * HWPoison pages have elevated reference counts so the migration would
>  		 * fail on them. It also doesn't make any sense to migrate them in the
> @@ -1807,6 +1798,8 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  				folio_isolate_lru(folio);
>  			if (folio_mapped(folio))
>  				unmap_poisoned_folio(folio, TTU_IGNORE_MLOCK);
> +			if (folio_test_large(folio))
> +				pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>  			continue;
>  		}
>
> @@ -1823,6 +1816,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  				dump_page(page, "isolation failed");
>  			}
>  		}
> +
> +		if (folio_test_large(folio))
> +			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>  put_folio:
>  		folio_put(folio);
>  	}
>
>
>>
>> do_migrate_range is called after start_isolate_page_range(). So a page might not be able to
>> transform from a head page to a tail page as it's isolated?
> start_isolate_page_range() only isolates free pages, so it may be irrelevant.
For a page to change from a head page into a tail page, it has to go through the following steps:
1. The compound page is freed into buddy.
2. It's merged into a larger order in buddy.
3. It's allocated as a larger-order compound page.
Since it is isolated, I think step 2 or 3 cannot happen. Or am I missing something?
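To make the argument concrete, here is a toy userspace model of the three steps. It is only a sketch and not kernel code: every name in it (toy_block, toy_free(), toy_merge(), toy_alloc_larger(), toy_head_can_become_tail()) is made up for illustration. It encodes my assumption that isolation stops the allocation in step 3. Blocking step 2 instead would give the same result. Whether the real allocator behaves this way for the range handled by do_migrate_range() is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names exist in the kernel. */
enum toy_state { TOY_HEAD, TOY_FREE, TOY_MERGED, TOY_TAIL };

struct toy_block {
	enum toy_state state;
	bool isolated;	/* stands in for the range being isolated */
};

/* Step 1: the compound page is freed into buddy. */
static bool toy_free(struct toy_block *b)
{
	assert(b != NULL);
	if (b->state != TOY_HEAD)
		return false;
	b->state = TOY_FREE;
	return true;
}

/* Step 2: the free block is merged into a larger order in buddy. */
static bool toy_merge(struct toy_block *b)
{
	assert(b != NULL);
	if (b->state != TOY_FREE)
		return false;
	b->state = TOY_MERGED;
	return true;
}

/*
 * Step 3: the merged block is allocated as a larger-order compound
 * page, so the old head becomes a tail.
 * ASSUMPTION: an isolated block cannot be allocated.
 */
static bool toy_alloc_larger(struct toy_block *b)
{
	assert(b != NULL);
	if (b->state != TOY_MERGED || b->isolated)
		return false;
	b->state = TOY_TAIL;
	return true;
}

/* Try all three steps; report whether the head ended up as a tail. */
static bool toy_head_can_become_tail(bool isolated)
{
	struct toy_block b = { .state = TOY_HEAD, .isolated = isolated };

	if (!toy_free(&b))
		return false;
	if (!toy_merge(&b))
		return false;
	return toy_alloc_larger(&b);
}
```

In this model toy_head_can_become_tail(false) succeeds and toy_head_can_become_tail(true) does not. If start_isolate_page_range() does not give that guarantee for pages that are still in use, the model does not apply and the race is real.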
Thanks.