Re: [PATCH 0/2] mm: Enable page parallel initialisation for Power

From: Li Zhang <zhlcindy@gmail.com>
To: Balbir Singh <bsingharora@gmail.com>
Cc: akpm@linux-foundation.org, Vlastimil Babka <vbabka@suse.cz>,
	mgorman@techsingularity.net,
	Michael Ellerman <mpe@ellerman.id.au>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	aneesh.kumar@linux.vnet.ibm.com, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Zhang <zhlcindy@linux.vnet.ibm.com>
Subject: Re: [PATCH 0/2] mm: Enable page parallel initialisation for Power
Date: Wed, 9 Mar 2016 13:50:21 +0800	[thread overview]
Message-ID: <CAD8of+o8u_vhvcO3EeL4a7jgmG1xL4yoXv9+dCK1s2c_6uJVww@mail.gmail.com> (raw)
In-Reply-To: <56DFA66F.2020002@gmail.com>

On Wed, Mar 9, 2016 at 12:28 PM, Balbir Singh <bsingharora@gmail.com> wrote:
>
>
> On 09/03/16 15:17, Li Zhang wrote:
>> On Tue, Mar 8, 2016 at 10:45 PM, Balbir Singh <bsingharora@gmail.com> wrote:
>>>
>>> On 08/03/16 14:55, Li Zhang wrote:
>>>> From: Li Zhang <zhlcindy@linux.vnet.ibm.com>
>>>>
>>>> Upstream already supports parallel page initialisation on x86, where
>>>> it improves boot time greatly. Some tests have been done for Power.
>>>>
>>>> Here are the results I measured with different memory sizes.
>>>>
>>>> * 4GB memory:
>>>>     boot time is as follows:
>>>>     with patch vs without patch: 10.4s vs 24.5s
>>>>     boot time is improved by about 57%
>>>> * 200GB memory:
>>>>     boot time looks the same with and without patches.
>>>>     boot time is about 38s
>>>> * 32TB memory:
>>>>     boot time looks the same with and without patches
>>>>     boot time is about 160s.
>>>>     This is much shorter than on x86 with 24TB of memory.
>>>>     According to community discussion, an x86 24TB system takes about 694s.
>>>>
>>>> From the code's point of view, parallel initialisation improves
>>>> performance by deferring memory initialisation to kswapd with N kthreads,
>>>> so in theory it should improve boot time.
>>>>
>>>> From the test results, on x86 performance is improved greatly with huge
>>>> memory. On the Power platform, it is improved greatly with less than
>>>> 100GB of memory, but not with huge memory. Even so, it saves some time
>>>> by using several threads, as the following information shows (32TB
>>>> system log):
>>>>
>>>> [   22.648169] node 9 initialised, 16607461 pages in 280ms
>>>> [   22.783772] node 3 initialised, 23937243 pages in 410ms
>>>> [   22.858877] node 6 initialised, 29179347 pages in 490ms
>>>> [   22.863252] node 2 initialised, 29179347 pages in 490ms
>>>> [   22.907545] node 0 initialised, 32049614 pages in 540ms
>>>> [   22.920891] node 15 initialised, 32212280 pages in 550ms
>>>> [   22.923236] node 4 initialised, 32306127 pages in 550ms
>>>> [   22.923384] node 12 initialised, 32314319 pages in 550ms
>>>> [   22.924754] node 8 initialised, 32314319 pages in 550ms
>>>> [   22.940780] node 13 initialised, 33353677 pages in 570ms
>>>> [   22.940796] node 11 initialised, 33353677 pages in 570ms
>>>> [   22.941700] node 5 initialised, 33353677 pages in 570ms
>>>> [   22.941721] node 10 initialised, 33353677 pages in 570ms
>>>> [   22.941876] node 7 initialised, 33353677 pages in 570ms
>>>> [   22.944946] node 14 initialised, 33353677 pages in 570ms
>>>> [   22.946063] node 1 initialised, 33345485 pages in 580ms
>>>>
>>>> It saves about 550*16 ms at least, although that is negligible compared
>>>> with the boot time of about 160 seconds. What's more, boot time on a
>>>> huge-memory machine is much shorter on Power than on x86, even without
>>>> the patches.
>>>>
>>>> So it is still worth enabling this patchset for Power.
>>>>
>>>>
>> Hi Balbir,
>>
>> Thanks for your review.
>>
>>> The patchset looks good, two questions
>>>
>>> 1. The patchset is still necessary for
>>>     a. systems with smaller amount of RAM?
>>        I think it is. So far I have tested systems with 4GB and 50GB, and
>> boot time is improved on both.
>>        We may test more systems with different memory sizes in the future.
>>>     b. Theoretically it improves boot time?
>>        The boot time is improved only a little on huge-memory systems,
>> and the gain can be ignored.
>>        But I think it's still necessary to enable this feature.
>>
>>> 2. the pgdat->node_spanned_pages >> 8 sounds arbitrary
>>>     On a system with 2TB*16 nodes, it would initialize about 8GB before calling deferred init?
>>>     Don't we need at least 32GB + space for other early hash allocations?
>>>     BTW, my expectation was that 32TB would imply 32GB+32GB of large hash allocations early on
>>       pgdat->node_spanned_pages >> 8 is the amount of memory that is
>> initialised early on one node.
>>       On a system with 2TB * 16 nodes, it will initialise 16 * 8GB = 128GB.
>>       I am not sure whether it can be reduced to >> 16 and still make sure
>> all the architectures with different
>>       memory sizes work well.  This was also mentioned in the earlier
>> discussion for x86, so I chose >> 8.
>>
>> *    From the code, as follows:
>>
>>       free_area_init_core ->
>>                      memmap_init->
>>                               update_defer_init
>>      #define memmap_init(size, nid, zone, start_pfn) \
>>            memmap_init_zone((size), (nid), (zone), (start_pfn), MEMMAP_EARLY)
>>
>>      memmap_init_zone works on one zone, but free_area_init_core walks
>> all the zones, which helps to find the highest
>>      zone on the node. update_defer_init() then gets the maximum amount of
>> memory on the highest zone of a node to
>>      reserve for early initialisation.
>>
>>      static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>>      {
>>             ...
>>            for (j = 0; j < MAX_NR_ZONES; j++) {
>>                   ....
>>                  /* find the highest zone on a node. */
>>                  memmap_init(size, nid, j, zone_start_pfn);
>>                  ...
>>            }
>>      }
>>
>> *   From the dmesg log, after applying this patchset, the system has
>> 123013440K (about 117GB) available,
>>     which is enough for the Dentry cache hash table and Inode-cache hash table in
>> this system.
>>
>>     [    0.000000] Memory: 123013440K/31739871232K available (8000K kernel code, 1856K rwdata,
>>     3384K rodata, 6208K init, 2544K bss, 28531136K reserved, 0K cma-reserved)
>>
>> Thanks :)
>>
> Looks good! It seems the real benefit is for smaller systems - thanks for clarifying
> Please check if CMA is affected in any way
>

Sure, thanks.

> Acked-by: Balbir Singh <bsingharora@gmail.com>
>
> Balbir Singh.



-- 

Best Regards
-Li


Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-08  3:55 Li Zhang
2016-03-08  3:55 ` [PATCH 1/2] mm: meminit: initialise more memory for inode/dentry hash tables in early boot Li Zhang
2016-03-08 13:25   ` Vlastimil Babka
2016-03-08  3:55 ` [PATCH 2/2] powerpc/mm: Enable page parallel initialisation Li Zhang
2016-03-08  9:36   ` Michael Ellerman
2016-03-09  2:06     ` Li Zhang
2016-03-09 21:42     ` Andrew Morton
2016-03-10  0:28       ` Michael Ellerman
2016-03-08 14:45 ` [PATCH 0/2] mm: Enable page parallel initialisation for Power Balbir Singh
2016-03-09  4:17   ` Li Zhang
2016-03-09  4:28     ` Balbir Singh
2016-03-09  5:50       ` Li Zhang [this message]
