From: Paul Menzel <pmenzel+linux-mm@molgen.mpg.de>
To: linux-mm@kvack.org
Cc: x86@kernel.org, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: How to profile 160 ms spent in `add_highpages_with_active_regions()`?
Date: Tue, 21 Aug 2018 11:37:57 +0200 [thread overview]
Message-ID: <5e5a39f4-1b91-c877-1368-0946160ef4be@molgen.mpg.de> (raw)
In-Reply-To: <d5a65984-36a7-15d8-b04a-461d0f53d36d@molgen.mpg.de>
[-- Attachment #1: Type: text/plain, Size: 9718 bytes --]
[Removed non-working Pavel Tatashin <pasha.tatashin@oracle.com>]
Dear Linux folks,
On 08/17/18 10:12, Paul Menzel wrote:
> With the merge of branch 'x86-timers-for-linus'
> of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip early time
> stamps are now printed.
>
>> Early TSC based time stamping to allow better boot time analysis.
>> This comes with a general cleanup of the TSC calibration code which
>> grew warts and duct taping over the years and removes 250 lines of
>> code. Initiated and mostly implemented by Pavel with help from various
>> folks
>
> Looking at those early time stamps, in this case on an ASRock E350M1,
> there is a 160 ms delay in the code below.
>
> Before:
>
> ```
> [ 0.000000] Initializing CPU#0
> [ 0.000000] Initializing HighMem for node 0 (000373fe:000c7d3c)
> [ 0.000000] Initializing Movable for node 0 (00000000:00000000)
> [ 0.000000] Memory: 3225668K/3273580K available (8898K kernel code, 747K rwdata, 2808K rodata, 768K init, 628K bss, 47912K reserved, 0K cma-reserved, 2368760K highmem)
> ```
>
> After:
>
> ```
> [ 0.063473] Initializing CPU#0
> [ 0.063484] Initializing HighMem for node 0 (00036ffe:000c7d3c)
> [ 0.229442] Initializing Movable for node 0 (00000000:00000000)
> [ 0.236020] Memory: 3225728K/3273580K available (8966K kernel code, 750K rwdata, 2828K rodata, 776K init, 640K bss, 47852K reserved, 0K cma-reserved, 2372856K highmem)
> ```
>
> The code in question is from `arch/x86/mm/highmem_32.c`.
>
>> void __init set_highmem_pages_init(void)
>> {
>> struct zone *zone;
>> int nid;
>>
>> /*
>> * Explicitly reset zone->managed_pages because set_highmem_pages_init()
>> * is invoked before free_all_bootmem()
>> */
>> reset_all_zones_managed_pages();
>> for_each_zone(zone) {
>> unsigned long zone_start_pfn, zone_end_pfn;
>>
>> if (!is_highmem(zone))
>> continue;
>>
>> zone_start_pfn = zone->zone_start_pfn;
>> zone_end_pfn = zone_start_pfn + zone->spanned_pages;
>>
>> nid = zone_to_nid(zone);
>> printk(KERN_INFO "Initializing %s for node %d (%08lx:%08lx)\n",
>> zone->name, nid, zone_start_pfn, zone_end_pfn);
>>
>> add_highpages_with_active_regions(nid, zone_start_pfn,
>> zone_end_pfn);
>> }
>> }
>
> And in there, it seems to be the function below.
>
>> void __init add_highpages_with_active_regions(int nid,
>> unsigned long start_pfn, unsigned long end_pfn)
>> {
>> phys_addr_t start, end;
>> u64 i;
>>
>> for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &start, &end, NULL) {
>> unsigned long pfn = clamp_t(unsigned long, PFN_UP(start),
>> start_pfn, end_pfn);
>> unsigned long e_pfn = clamp_t(unsigned long, PFN_DOWN(end),
>> start_pfn, end_pfn);
>> for ( ; pfn < e_pfn; pfn++)
>> if (pfn_valid(pfn))
>> free_highmem_page(pfn_to_page(pfn));
>
> Assuming the time stamps are correct, how can a profile that delay
> without adding print statements all over the place? Using ftrace
> doesn’t seem to work for me, probably because it’s that early.
I added print statements, and got the result below.
```
[ 0.057109] Initializing CPU#0
[ 0.057119] Initializing HighMem for node 0 (00036ffe:000c7d3c)
[ 0.057122] add_highpages_with_active_regions: for_each_free_mem_range: i = 0
[ 0.057124] add_highpages_with_active_regions: after for loop
[ 0.057126] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057127] add_highpages_with_active_regions: after for loop
[ 0.057128] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057129] add_highpages_with_active_regions: after for loop
[ 0.057131] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057132] add_highpages_with_active_regions: after for loop
[ 0.057133] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057134] add_highpages_with_active_regions: after for loop
[ 0.057136] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057137] add_highpages_with_active_regions: after for loop
[ 0.057138] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057140] add_highpages_with_active_regions: after for loop
[ 0.057141] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057142] add_highpages_with_active_regions: after for loop
[ 0.057143] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057145] add_highpages_with_active_regions: after for loop
[ 0.057146] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057147] add_highpages_with_active_regions: after for loop
[ 0.057149] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057150] add_highpages_with_active_regions: after for loop
[ 0.057151] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057152] add_highpages_with_active_regions: after for loop
[ 0.057154] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057155] add_highpages_with_active_regions: after for loop
[ 0.057156] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057157] add_highpages_with_active_regions: after for loop
[ 0.057159] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057160] add_highpages_with_active_regions: after for loop
[ 0.057161] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057163] add_highpages_with_active_regions: after for loop
[ 0.057164] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.057165] add_highpages_with_active_regions: after for loop
[ 0.057167] add_highpages_with_active_regions: for_each_free_mem_range: i = 2
[ 0.222580] add_highpages_with_active_regions: after for loop
[ 0.222583] Initializing Movable for node 0 (00000000:00000000)
[ 0.222585] add_highpages_with_active_regions: for_each_free_mem_range: i = 0
[ 0.222587] add_highpages_with_active_regions: after for loop
[ 0.222588] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222589] add_highpages_with_active_regions: after for loop
[ 0.222591] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222592] add_highpages_with_active_regions: after for loop
[ 0.222593] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222595] add_highpages_with_active_regions: after for loop
[ 0.222596] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222597] add_highpages_with_active_regions: after for loop
[ 0.222599] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222600] add_highpages_with_active_regions: after for loop
[ 0.222601] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222602] add_highpages_with_active_regions: after for loop
[ 0.222604] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222605] add_highpages_with_active_regions: after for loop
[ 0.222606] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222607] add_highpages_with_active_regions: after for loop
[ 0.222609] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222610] add_highpages_with_active_regions: after for loop
[ 0.222611] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222613] add_highpages_with_active_regions: after for loop
[ 0.222614] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222615] add_highpages_with_active_regions: after for loop
[ 0.222616] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222618] add_highpages_with_active_regions: after for loop
[ 0.222619] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222620] add_highpages_with_active_regions: after for loop
[ 0.222622] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222623] add_highpages_with_active_regions: after for loop
[ 0.222624] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222625] add_highpages_with_active_regions: after for loop
[ 0.222627] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[ 0.222628] add_highpages_with_active_regions: after for loop
[ 0.222629] add_highpages_with_active_regions: for_each_free_mem_range: i = 2
[ 0.222630] add_highpages_with_active_regions: after for loop
```
So, the problem is with i = 2.
Kind regards,
Paul
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]
next prev parent reply other threads:[~2018-08-21 9:38 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-17 8:12 Paul Menzel
2018-08-21 9:37 ` Paul Menzel [this message]
2018-08-22 13:25 ` Paul Menzel
2018-08-22 13:41 ` Thomas Gleixner
2018-09-04 9:16 ` Paul Menzel
2018-09-13 12:04 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5e5a39f4-1b91-c877-1368-0946160ef4be@molgen.mpg.de \
--to=pmenzel+linux-mm@molgen.mpg.de \
--cc=linux-mm@kvack.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox