From: Joao Martins <joao.m.martins@oracle.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
Muchun Song <songmuchun@bytedance.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2] mm/hugetlb_vmemmap: remap head page to newly allocated page
Date: Wed, 9 Nov 2022 13:15:36 +0000
Message-ID: <1dcf6d8c-dcb0-4f52-8fcc-9aa56bd0cdd6@oracle.com>
In-Reply-To: <A5CDFA33-D30F-4E58-AE84-01191C792D96@linux.dev>
On 09/11/2022 02:42, Muchun Song wrote:
>> On Nov 8, 2022, at 18:38, Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 08/11/2022 09:13, Muchun Song wrote:
>>>> On Nov 7, 2022, at 23:39, Joao Martins <joao.m.martins@oracle.com> wrote:
>>>> + int nid = page_to_nid((struct page *)start);
>>>> + struct page *page = NULL;
>>>> +
>>>> + /*
>>>> + * Allocate a new head vmemmap page to avoid breaking a contiguous
>>>> + * block of struct page memory when freeing it back to page allocator
>>>> + * in free_vmemmap_page_list(). This will allow the likely contiguous
>>>> + * struct page backing memory to be kept contiguous, allowing for
>>>> + * more allocations of hugepages. Fall back to the currently
>>>> + * mapped head page should it fail to allocate.
>>>> + */
>>>> + if (IS_ALIGNED((unsigned long)start, PAGE_SIZE))
>>>
>>> I'm curious why we need this check. IIUC, this is unnecessary.
>>>
>>
>> So if the start of the vmemmap range we remap (the head page) isn't the
>> first struct page in that vmemmap page, then we would corrupt the other
>> struct pages in it that are unrelated to hugetlb? That was my thinking.
>
> Actually, the @start address should always be aligned to PAGE_SIZE. If not,
> vmemmap_remap_range() will complain. So the check can be removed.
>
True, but that complaint is a VM_BUG_ON().
I can remove the check here, but rather than crashing I would
suggest we detect the error in the caller (vmemmap_remap_free())
before proceeding.
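
Something along these lines near the top of vmemmap_remap_free(), before
the walk is started (untested sketch; the exact error handling depends on
how the caller wants to treat a misaligned range):

	if (WARN_ON_ONCE(!IS_ALIGNED(start, PAGE_SIZE) ||
			 !IS_ALIGNED(end, PAGE_SIZE)))
		return -EINVAL;	/* skip the optimization instead of BUG()ing */

That keeps the VM_BUG_ON() in vmemmap_remap_range() as a last resort while
failing gracefully for a misaligned range.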
>>
>>>> + page = alloc_pages_node(nid, gfp_mask, 0);
>>>> + walk.head_page = page;
>>>>
>>>> /*
>>>> * In order to make remapping routine most efficient for the huge pages,
>>>> --
>>>> 2.17.2
>>>>
>>>
>>> I have implemented a version based on yours which does not introduce the
>>> ->head_page field (not tested whether it works). It seems simpler.
>>>
>>
>> Let me try it out with the adjustment below.
>>
>>> Thanks.
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index c98805d5b815..8ee94f6a6697 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -202,12 +202,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
>>> return ret;
>>> } while (pgd++, addr = next, addr != end);
>>>
>>> - /*
>>> - * We only change the mapping of the vmemmap virtual address range
>>> - * [@start + PAGE_SIZE, end), so we only need to flush the TLB which
>>> - * belongs to the range.
>>> - */
>>> - flush_tlb_kernel_range(start + PAGE_SIZE, end);
>>> + flush_tlb_kernel_range(start, end);
>>>
>>> return 0;
>>> }
>>> @@ -246,6 +241,12 @@ static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
>>> pte_t entry = mk_pte(walk->reuse_page, pgprot);
>>> struct page *page = pte_page(*pte);
>>>
>>> + /* The case of remapping the head vmemmap page. */
I changed this comment to:
/* Remapping the head page requires r/w */
>>> + if (unlikely(addr == walk->reuse_addr)) {
>>
>> You replaced the head_page with a check against reuse_addr, but that is
>> set on the tail page. So if we want to rely on reuse_addr, perhaps it is
>> best if we do:
>>
>> if (unlikely(addr == (walk->reuse_addr - PAGE_SIZE))) {
>> ...
>> }
>
> I don't think so. The @addr here should be equal to @walk->reuse_addr
> when vmemmap_remap_pte() is first called, since @addr is not updated
> by vmemmap_pte_range(). Right?
>
Ah, yes -- I misread it. I was confusing this with the scheme that reuses
2 vmemmap pages rather than the current one that reuses 1 (which is the
latest, yes), and thus reuse_addr does point at the head page.
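
IIUC, vmemmap_pte_range() only skips the first PTE while @walk->reuse_page
is still unset, so with your version pre-populating reuse_page with the
newly allocated page, the walk starts remapping right at the reuse address
and the first remap_pte() call indeed sees @addr == @walk->reuse_addr.
Roughly (sketch of the existing logic, bookkeeping such as nr_walked
omitted):

	if (!walk->reuse_page) {
		/* old behaviour: record the reuse page and skip over it */
		walk->reuse_page = pte_page(*pte);
		addr += PAGE_SIZE;
		pte++;
	}

	for (; addr != end; addr += PAGE_SIZE, pte++)
		walk->remap_pte(pte, addr, walk);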
OK, this should be much cleaner with the adjustments you proposed, and it
works just as well as far as I have tested.
I'll respin v3.
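
For reference, I expect vmemmap_remap_pte() to end up looking roughly like
the below in v3 (untested sketch; the final hunk may also need to keep the
freshly allocated head page off the to-be-freed vmemmap_pages list):

static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
			      struct vmemmap_remap_walk *walk)
{
	/*
	 * Remap the tail pages as read-only to catch illegal write
	 * operations to them.
	 */
	pgprot_t pgprot = PAGE_KERNEL_RO;
	struct page *page = pte_page(*pte);
	pte_t entry;

	/* Remapping the head page requires r/w */
	if (unlikely(addr == walk->reuse_addr))
		pgprot = PAGE_KERNEL;

	entry = mk_pte(walk->reuse_page, pgprot);
	list_add_tail(&page->lru, walk->vmemmap_pages);
	set_pte_at(&init_mm, addr, pte, entry);
}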