From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, suzuki.poulose@arm.com,
	punit.agrawal@arm.com, will.deacon@arm.com, Steven.Price@arm.com,
	catalin.marinas@arm.com, mike.kravetz@oracle.com,
	n-horiguchi@ah.jp.nec.com
Subject: Re: [PATCH 1/4] mm/hugetlb: Enable PUD level huge page migration
Date: Wed, 3 Oct 2018 17:07:13 +0530
Message-ID: <34b25855-fcef-61ed-312d-2011f80bdec4@arm.com>
In-Reply-To: <20181003105926.GA4714@dhcp22.suse.cz>



On 10/03/2018 04:29 PM, Michal Hocko wrote:
> On Wed 03-10-18 15:28:23, Anshuman Khandual wrote:
>>
>>
>> On 10/03/2018 12:28 PM, Michal Hocko wrote:
>>> On Wed 03-10-18 07:46:27, Anshuman Khandual wrote:
>>>>
>>>>
>>>> On 10/02/2018 06:09 PM, Michal Hocko wrote:
>>>>> On Tue 02-10-18 17:45:28, Anshuman Khandual wrote:
>>>>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs
>>>>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be
>>>>>> enabled for migration. It can be achieved through checking for PUD_SHIFT
>>>>>> order based HugeTLB pages during migration.
>>>>>
>>>>> Well a long term problem with hugepage_migration_supported is that it is
>>>>> used in two different contexts 1) to bail out from the migration early
>>>>> because the arch doesn't support migration at all and 2) to use movable
>>>>> zone for hugetlb pages allocation. I am especially concerned about the
>>>>> latter because the mere support for migration is not really good enough.
>>>>> Are you really able to find a different giga page during the runtime to
>>>>> move an existing giga page out of the movable zone?
>>>>
>>>> I pre-allocate them before trying to initiate the migration (soft offline
>>>> in my experiments). Hence it should come from the pre-allocated HugeTLB
>>>> pool instead of from the buddy. I might be missing something here but do we
>>>> ever allocate HugeTLB on the go when trying to migrate ? IIUC it always
>>>> came from the pool (unless it's something related to overcommit/surplus).
>>>> Could you please kindly explain how migration target HugeTLB
>>>> pages are allocated on the fly from movable zone.
>>>
>>> Hotplug comes to mind. You usually do not pre-allocate to cover full
>>> node going offline. And people would like to do that. Another example is
>>> CMA. You would really like to move pages out of the way.
>>
>> You are right.
>>
>> Hotplug migration:
>>
>> __offline_pages
>>    do_migrate_range
>> 	migrate_pages(...new_node_page...)
>>
>> new_node_page
>>    new_page_nodemask
>> 	alloc_huge_page_nodemask
>> 	   dequeue_huge_page_nodemask (Getting from pool)
>> 	or
>> 	   alloc_migrate_huge_page    (Getting from buddy - non-gigantic)
>> 		alloc_fresh_huge_page
>> 		    alloc_buddy_huge_page
>> 			__alloc_pages_nodemask ----> goes into buddy
>>
>> CMA allocation:
>>
>> cma_alloc
>>    alloc_contig_range
>> 	__alloc_contig_migrate_range
>> 		migrate_pages(...alloc_migrate_target...)
>>
>> alloc_migrate_target
>>    new_page_nodemask -> __alloc_pages_nodemask ---> goes into buddy
>>
>> But this is not applicable for gigantic pages for which it backs off way
>> before going into buddy.
> 
> This is an implementation detail - mostly a missing or an incomplete
> hugetlb overcommit implementation IIRC. The primary point remains the
> same. Being able to migrate in principle and feasible enough to migrate
> to be placed in zone movable are two distinct things.

I agree. They are two distinct things.

> [...]
>>>> But even if there are some chances of run time allocation failure from
>>>> movable zone (as in point 2) that should not block the very initiation
>>>> of migration itself. IIUC that's not the semantics for either THP or
>>>> normal pages. Why should it be different here ? If the allocation fails
>>>> we should report and abort as always. It's the caller of migration taking
>>>> the chances. Why should we prevent that ?
>>>
>>> Yes I agree, hence the distinction between the arch support for
>>> migrateability and the criterion on the movable zone placement.
>> Movable zone placement sounds very tricky here. How can the platform
>> (through the hook huge_movable) say beforehand whether the destination
>> page could be allocated from the ZONE_MOVABLE without looking into the
>> state of buddy at migration (any sort of attempt to do this is going to
>> be expensive) ? Or does it merely indicate the desire to live with the
>> possible consequence (unable to hot unplug/CMA going forward) of a
>> migration which might end up in an unmovable area ?
> 
> I do not follow. The whole point of zone_movable is to provide a
> physical memory range which is more or less movable. That means that
> pages allocated from this zone can be migrated away should there be a
> reason for that.

I understand this.

> 
>>>>> So I guess we want to split this into two functions
>>>>> arch_hugepage_migration_supported and hugepage_movable. The latter would
>>>>
>>>> So the set difference between arch_hugepage_migration_supported and
>>>> hugepage_movable still remains un-migratable ? Then what is the purpose
>>>> of the arch_hugepage_migration_supported page size set in the first place ?
>>>> Does it mean we allow the migration at the beginning and then abort later
>>>> when the page size does not fall within the subset for hugepage_movable ?
>>>> Could you please kindly explain this in more detail.
>>>
>>> The purpose of arch_hugepage_migration_supported is to tell whether it
>>> makes any sense to even try to migrate. The allocation placement is
>>
>> Which kind of matches what we have right now and is being continued with
>> this proposal in the series.
> 
> Except you only go half way there. Because you still consider "able to
> migrate" and "feasible to migrate" as the same thing.

Okay.

> 
>>
>>> completely independent of this choice. The latter just says whether it is
>>> feasible to place a hugepage to the zone movable. Sure regular 2MB pages
>>
>> What do you exactly mean by feasible ? Won't it depend on the state of the
>> buddy allocator (ZONE_MOVABLE in particular) and its ability to accommodate
>> a given huge page ? How can the platform decide on it ?
> 
> It is not the platform that decides. That is the whole point of the
> distinction. It is up to us to say what is feasible and what we want to
> support. Do we want to support giga pages in zone_movable? Under which
> conditions? See my point?

So huge_movable() is going to be a generic MM function that decides whether
allocating a huge page of 'size' from the movable zone is feasible during
migration. If it turns out not to be feasible, the migration is aborted
right there.

huge_movable() would do something like this:

- Return positive right away for smaller huge page sizes
- Measure movable allocation feasibility for bigger huge pages
	- Look for free_pages of the huge page order in movable areas
	- if (order > (MAX_ORDER - 1))
		- Scan the PFN ranges in movable zone for possible allocation
	- etc

Did I get this right ?
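
The steps listed above can be sketched as a small userspace model in C. This
is only an illustration under stated assumptions, not kernel code: struct
movable_zone_model, huge_movable_model(), scan_pfn_range() and the MODEL_*
constants are hypothetical names, the "smaller size" cutoff of order 9 (2MB
with 4K base pages) is an assumed policy, and the PFN scan only considers
naturally aligned ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Everything here is a hypothetical model, not an existing kernel interface. */
#define MODEL_MAX_ORDER   11	/* buddy serves orders 0..MODEL_MAX_ORDER-1 */
#define MODEL_SMALL_ORDER  9	/* assumed cutoff: 2MB with 4K base pages */

struct movable_zone_model {
	unsigned long nr_free[MODEL_MAX_ORDER];	/* free blocks per buddy order */
	const bool *pfn_free;			/* true when that PFN is free */
	size_t nr_pfns;				/* number of PFNs in the zone */
};

/* Look for a naturally aligned run of 2^order free PFNs. */
static bool scan_pfn_range(const struct movable_zone_model *z,
			   unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t start = 0; start + nr <= z->nr_pfns; start += nr) {
		size_t i = 0;

		while (i < nr && z->pfn_free[start + i])
			i++;
		if (i == nr)
			return true;
	}
	return false;
}

/* Feasibility of allocating a huge page of 'order' from the movable zone. */
static bool huge_movable_model(const struct movable_zone_model *z,
			       unsigned int order)
{
	assert(z != NULL);

	/* Smaller huge pages: return positive right away. */
	if (order <= MODEL_SMALL_ORDER)
		return true;

	/* Beyond what the buddy can serve: scan the PFN ranges. */
	if (order > MODEL_MAX_ORDER - 1)
		return scan_pfn_range(z, order);

	/* Otherwise a free block of this order or higher is enough. */
	for (unsigned int o = order; o < MODEL_MAX_ORDER; o++)
		if (z->nr_free[o] > 0)
			return true;
	return false;
}
```

The model answers from a snapshot of the zone state, so a positive answer can
still be followed by an allocation failure when the real migration runs.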

> 
>> Or as I mentioned
>> before it's platform's willingness to live with unmovable huge pages (of
>> certain sizes) as a consequence of migration.
> 
> No, the platform has no say in that. The platform only says that it
> supports migrating those pages in principle.

I understand this now.
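
The split discussed in this thread can be sketched as two separate
predicates with two separate users. This is a minimal model in C, assuming
arm64 with 4K base pages (PMD_SHIFT 21, PUD_SHIFT 30). The names
arch_hugepage_migration_supported and hugepage_movable come from the
suggestion quoted earlier; their bodies, hugetlb_alloc_zone(),
migrate_huge_page_precheck() and the policy that only PMD sized pages are
placed in the movable zone are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Shift values for arm64 with 4K base pages. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21	/* 2MB huge page */
#define PUD_SHIFT  30	/* 1GB huge page */

/* Hypothetical stand-in for the zone selection done at allocation time. */
enum zone_choice { ZONE_CHOICE_NORMAL, ZONE_CHOICE_MOVABLE };

/* Arch side: can a huge page of this size be migrated in principle? */
static bool arch_hugepage_migration_supported(unsigned int shift)
{
	assert(shift >= PAGE_SHIFT);
	return shift == PMD_SHIFT || shift == PUD_SHIFT;
}

/*
 * Generic MM side: is it feasible to place this size in the movable zone?
 * Assumed policy for this sketch: only PMD sized pages qualify.
 */
static bool hugepage_movable(unsigned int shift)
{
	if (!arch_hugepage_migration_supported(shift))
		return false;
	return shift == PMD_SHIFT;
}

/* User 1: bail out of migration early; 0 means go ahead, -1 means refuse. */
static int migrate_huge_page_precheck(unsigned int shift)
{
	return arch_hugepage_migration_supported(shift) ? 0 : -1;
}

/* User 2: decide whether new huge pages may come from the movable zone. */
static enum zone_choice hugetlb_alloc_zone(unsigned int shift)
{
	return hugepage_movable(shift) ? ZONE_CHOICE_MOVABLE
				       : ZONE_CHOICE_NORMAL;
}
```

Under these assumptions a PUD sized page passes the migration precheck but
is still kept out of the movable zone, which is the "able to migrate" versus
"feasible to migrate" distinction.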
