linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	linux-mm@kvack.org, dan.carpenter@linaro.org,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v3 1/5] mm: memory_hotplug: remove head variable in do_migrate_range()
Date: Mon, 30 Sep 2024 11:25:08 +0200
Message-ID: <bd98e0fc-949d-4850-89f5-de1b1c4995dd@redhat.com>
In-Reply-To: <986bca05-7fd9-49ac-9129-934f31c28af6@huawei.com>

On 29.09.24 04:19, Miaohe Lin wrote:
> On 2024/9/29 10:04, Kefeng Wang wrote:
>>
>>
>> On 2024/9/29 9:16, Miaohe Lin wrote:
>>> On 2024/9/28 16:39, David Hildenbrand wrote:
>>>> On 28.09.24 10:34, David Hildenbrand wrote:
>>>>> On 28.09.24 06:55, Matthew Wilcox wrote:
>>>>>> On Tue, Aug 27, 2024 at 07:47:24PM +0800, Kefeng Wang wrote:
>>>>>>> Directly use a folio for HugeTLB and THP when calculating the next pfn, then
>>>>>>> remove the unused head variable.
>>>>>>
>>>>>> I just noticed this got merged.  You're going to hit BUG_ON with it.
>>>>>>
>>>>>>> -        if (PageHuge(page)) {
>>>>>>> -            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>>>>>> -            isolate_hugetlb(folio, &source);
>>>>>>> -            continue;
>>>>>>> -        } else if (PageTransHuge(page))
>>>>>>> -            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>>>>>>> +        /*
>>>>>>> +         * No reference or lock is held on the folio, so it might
>>>>>>> +         * be modified concurrently (e.g. split).  As such,
>>>>>>> +         * folio_nr_pages() may read garbage.  This is fine as the outer
>>>>>>> +         * loop will revisit the split folio later.
>>>>>>> +         */
>>>>>>> +        if (folio_test_large(folio)) {
>>>>>>
>>>>>> But it's not fine.  Look at the implementation of folio_test_large():
>>>>>>
>>>>>> static inline bool folio_test_large(const struct folio *folio)
>>>>>> {
>>>>>>             return folio_test_head(folio);
>>>>>> }
>>>>>>
>>>>>> That's going to be provided by:
>>>>>>
>>>>>> #define FOLIO_TEST_FLAG(name, page)                                     \
>>>>>> static __always_inline bool folio_test_##name(const struct folio *folio) \
>>>>>> { return test_bit(PG_##name, const_folio_flags(folio, page)); }
>>>>>>
>>>>>> and here's the BUG:
>>>>>>
>>>>>> static const unsigned long *const_folio_flags(const struct folio *folio,
>>>>>>                     unsigned n)
>>>>>> {
>>>>>>             const struct page *page = &folio->page;
>>>>>>
>>>>>>             VM_BUG_ON_PGFLAGS(PageTail(page), page);
>>>>>>             VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
>>>>>>             return &page[n].flags;
>>>>>> }
>>>>>>
>>>>>> (this page can be transformed from a head page to a tail page because,
>>>>>> as the comment notes, we don't hold a reference.)
>>>>>>
>>>>>> Please back this out.
>>>>>
>>>>> Should we generalize the approach in dump_folio() to locally copy a
>>>>> folio, so we can safely perform checks before deciding whether we want
>>>>> to try grabbing a reference on the real folio (if it's still a folio :) )?
>>>>>
>>>>
>>>> Oh, and I forgot: isn't the existing code already racy?
>>>>
>>>> PageTransHuge() -> VM_BUG_ON_PAGE(PageTail(page), page);
>>
>> Yes, in v1[1], I asked the same question about the existing PageTransHuge(page) code:
>>
>>    "If the page is a tail page, we will BUG_ON(DEBUG_VM enabled) here,
>>     but it seems that we don't guarantee the page won't be a tail page."
>>
>>
>> We could delay the calculation until after we hold a reference, but the pfn traversal may slow down a little if it hits a tail pfn. Is that acceptable?
>>
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1786,15 +1786,6 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>                  page = pfn_to_page(pfn);
>>                  folio = page_folio(page);
>>
>> -               /*
>> -                * No reference or lock is held on the folio, so it might
>> -                * be modified concurrently (e.g. split).  As such,
>> -                * folio_nr_pages() may read garbage.  This is fine as the outer
>> -                * loop will revisit the split folio later.
>> -                */
>> -               if (folio_test_large(folio))
>> -                       pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>> -
>>                  /*
>>                   * HWPoison pages have elevated reference counts so the migration would
>>                   * fail on them. It also doesn't make any sense to migrate them in the
>> @@ -1807,6 +1798,8 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>                                  folio_isolate_lru(folio);
>>                          if (folio_mapped(folio))
>>                                  unmap_poisoned_folio(folio, TTU_IGNORE_MLOCK);
>> +                       if (folio_test_large(folio))
>> +                               pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>                          continue;
>>                  }
>>
>> @@ -1823,6 +1816,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>                                  dump_page(page, "isolation failed");
>>                          }
>>                  }
>> +
>> +               if (folio_test_large(folio))
>> +                       pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>   put_folio:
>>                  folio_put(folio);
>>          }
>>
>>
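
A minimal userspace sketch of the trade-off in the diff above, under the
assumption that the skip to the end of a large folio is only taken once a
reference is held. Everything here (struct toy_folio, toy_pfn_folio(),
toy_walk(), the ref_ok field) is a hypothetical stand-in for illustration,
not a kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio: a pfn range plus whether "try get ref" succeeds. */
struct toy_folio {
	unsigned long start_pfn;
	unsigned long nr_pages;
	bool ref_ok;	/* hypothetical: models a successful reference grab */
};

/* Return the toy folio covering @pfn, or NULL if none does. */
static const struct toy_folio *toy_pfn_folio(const struct toy_folio *folios,
					     size_t n, unsigned long pfn)
{
	for (size_t i = 0; i < n; i++) {
		if (pfn >= folios[i].start_pfn &&
		    pfn < folios[i].start_pfn + folios[i].nr_pages)
			return &folios[i];
	}
	return NULL;
}

/* Walk [start, end) and return how many loop iterations were needed. */
static unsigned long toy_walk(const struct toy_folio *folios, size_t n,
			      unsigned long start, unsigned long end)
{
	unsigned long iters = 0;

	for (unsigned long pfn = start; pfn < end; pfn++) {
		const struct toy_folio *folio = toy_pfn_folio(folios, n, pfn);

		iters++;
		/* No reference: do not trust the folio size, advance by one. */
		if (!folio || !folio->ref_ok)
			continue;
		/* Reference held: the size is stable, jump to the last pfn. */
		pfn = folio->start_pfn + folio->nr_pages - 1;
	}
	return iters;
}
```

With one 4-page folio at pfn 0 and a range of 8 pfns, the walk takes fewer
iterations when the reference is obtained than when it is not. That
difference is the slowdown being asked about.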
>>>
>>> do_migrate_range is called after start_isolate_page_range(). So a page might not be able to
>>> transform from a head page to a tail page as it's isolated?
>> start_isolate_page_range() only isolates free pages, so it may be irrelevant.
> 
> For a page to transform from a head page to a tail page, it has to go through the steps below:
> 1. The compound page is freed into buddy.
> 2. It's merged into larger order in buddy.
> 3. It's allocated as a larger order compound page.
> 
> Since it is isolated, I think step 2 or 3 cannot happen. Or am I missing something?

By isolated, you mean that the pageblock is isolated, and all free pages 
are in the MIGRATE_ISOLATE buddy list. Nice observation.

Indeed, a tail page could become a head page (concurrent split is
possible), but a head page should not become a tail page for the reason
you mention.
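
To make the argument concrete, here is a toy userspace model. It is only a
sketch: it encodes the assumption that nothing allocates from the
MIGRATE_ISOLATE free list, and it does not try to decide whether merging
(step 2) can still happen. All toy_* names are made up for illustration and
are not kernel APIs:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they mirror kernel concepts but are not kernel code. */
enum toy_migratetype { TOY_MIGRATE_MOVABLE, TOY_MIGRATE_ISOLATE };
enum toy_page_state { TOY_HEAD, TOY_FREE, TOY_TAIL };

struct toy_page {
	enum toy_page_state state;
	enum toy_migratetype mt;	/* migratetype of the containing pageblock */
};

/* Step 1: freeing always works; the page lands on the free list of its mt. */
static void toy_free(struct toy_page *p)
{
	p->state = TOY_FREE;
}

/*
 * Steps 2 and 3 collapsed into one: the free page ends up as a non-first
 * page of a larger compound allocation. Assumption encoded here: free pages
 * on the isolate list are never handed out by the allocator.
 */
static bool toy_realloc_as_tail(struct toy_page *p)
{
	if (p->state != TOY_FREE)
		return false;
	if (p->mt == TOY_MIGRATE_ISOLATE)
		return false;
	p->state = TOY_TAIL;
	return true;
}

/* Can a head page in a pageblock of type @mt end up as a tail page? */
static bool toy_head_can_become_tail(enum toy_migratetype mt)
{
	struct toy_page p = { .state = TOY_HEAD, .mt = mt };

	toy_free(&p);
	return toy_realloc_as_tail(&p);
}
```

In this model the transition succeeds for TOY_MIGRATE_MOVABLE and is refused
for TOY_MIGRATE_ISOLATE, which is the whole point: the page can still be
freed, but it cannot come back as part of a larger compound page.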

Even mm/page_reporting.c will skip isolated pageblocks.

I wonder if there are some corner cases, but nothing comes to mind that 
would perform compound allocations from the MIGRATE_ISOLATE list.

-- 
Cheers,

David / dhildenb


