From: Pingfan Liu <kernelfans@gmail.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Len Brown <lenb@kernel.org>, Yinghai Lu <yinghai@kernel.org>,
	Tejun Heo <tj@kernel.org>, Chao Fan <fanc.fnst@cn.fujitsu.com>,
	Baoquan He <bhe@redhat.com>, Juergen Gross <jgross@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	x86@kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv2 0/7] x86_64/mm: remove bottom-up allocation style by pushing forward the parsing of mem hotplug info
Date: Tue, 15 Jan 2019 14:06:18 +0800
Message-ID: <CAFgQCTtsw9xj3M85HU2GBk5iPSF4h_H43do-rfpXMo8svmgoJg@mail.gmail.com>
In-Reply-To: <fe88d6ff-00e1-b65d-f411-64b03227bd17@intel.com>

On Tue, Jan 15, 2019 at 7:02 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/10/19 9:12 PM, Pingfan Liu wrote:
> > Background
> > When kaslr kernel can be guaranteed to sit inside unmovable node
> > after [1].
>
> What does this "[1]" refer to?
>
https://lore.kernel.org/patchwork/patch/1029376/

> Also, can you clarify your terminology here a bit.  By "kaslr kernel",
> do you mean the base address?
>
I mean the randomized load address of the kernel. I looked it up, and
the usual term is indeed "base address".

> > But if kaslr kernel is located near the end of the movable node,
> > then bottom-up allocator may create pagetable which crosses the boundary
> > between unmovable node and movable node.
>
> Again, I'm confused.  Do you literally mean a single page table page?  I
> think you mean the page tables, but it would be nice to clarify this,
> and also explicitly state which page tables these are.
>
I mean page table pages, namely the ones built by init_mem_mapping().

> >  It is a probability issue,
> > two factors include -1. how big the gap between kernel end and
> > unmovable node's end.  -2. how many memory does the system own.
> > Alternative way to fix this issue is by increasing the gap by
> > boot/compressed/kaslr*.
>
> Oh, you mean the KASLR code in arch/x86/boot/compressed/kaslr*.[ch]?
>
Sorry, and yes: the code in arch/x86/boot/compressed/kaslr_64.c and kaslr.c.

> It took me a minute to figure out you were talking about filenames.
>
> > But taking the scenario of PB level memory, the pagetable will take
> > several MB even if using 1GB page, different page attr and fragment
> > will make things worse. So it is hard to decide how much should the
> > gap increase.
> I'm not following this.  If we move the image around, we leave holes.
> Why do we need page table pages allocated to cover these holes?
>
I mean that in arch/x86/boot/compressed/kaslr.c, store_slot_info()
computes

  slot_area.num = (region->size - image_size) / CONFIG_PHYSICAL_ALIGN + 1

If we denote the size of the page tables as "X", the formula would
become

  slot_area.num = (region->size - image_size - X) / CONFIG_PHYSICAL_ALIGN + 1

and it is hard to decide X because of the factors above.
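
To make the dependency on X concrete, here is a rough sketch in C. It
is not the kernel's code: slot_count() and pud_table_bytes() are
hypothetical helpers, the 2 MiB alignment used in the note below is
only an assumed value for CONFIG_PHYSICAL_ALIGN, and the estimate
counts only the PUD-level tables needed when everything is mapped with
1GB pages (512 entries per 4 KiB table). It ignores the upper levels,
mixed page attributes and fragmentation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE     4096ULL          /* size of one page table page */
#define TOY_PUD_TABLE_MAP (512ULL << 30)   /* 512 entries x 1 GiB each */

/*
 * Hypothetical helper mirroring the slot formula quoted above, with an
 * extra "reserve" term standing in for X. Returns 0 if nothing fits.
 */
static uint64_t slot_count(uint64_t region_size, uint64_t image_size,
                           uint64_t reserve, uint64_t align)
{
    assert(align > 0);
    if (region_size < image_size + reserve)
        return 0;
    return (region_size - image_size - reserve) / align + 1;
}

/*
 * Lower bound for X: bytes of PUD-level tables needed to map mem_bytes
 * with 1 GiB pages only. Real page tables can only be larger than this.
 */
static uint64_t pud_table_bytes(uint64_t mem_bytes)
{
    uint64_t tables = (mem_bytes + TOY_PUD_TABLE_MAP - 1) / TOY_PUD_TABLE_MAP;

    return tables * TOY_PAGE_SIZE;
}
```

Under these assumptions pud_table_bytes(1ULL << 50) gives 8 MiB for
1 PiB of memory, and reserving that much in a 1 GiB region with a
64 MiB image and 2 MiB alignment lowers slot_count() from 481 to 477.
Smaller mapping sizes or fragmentation only make X larger, which is
why a fixed gap is hard to pick.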

> > The following figure show the defection of current bottom-up style:
> >   [startA, endA][startB, "kaslr kernel very close to" endB][startC, endC]
>
> "defection"?
>
Oh, defect.

> > If nodeA,B is unmovable, while nodeC is movable, then init_mem_mapping()
> > can generate pgtable on nodeC, which stain movable node.
>
> Let me see if I can summarize this:
> 1. The kernel ASLR decompression code picks a spot to place the kernel
>    image in physical memory.
> 2. Some page tables are dynamically allocated near (after) this spot.
> 3. Sometimes, based on the random ASLR location, these page tables fall
>    over into the "movable node" area.  Being unmovable allocations, this
>    is not cool.
> 4. To fix this (on 64-bit at least), we stop allocating page tables
>    based on the location of the kernel image.  Instead, we allocate
>    using the memblock allocator itself, which knows how to avoid the
>    movable node.
>
Yes, you got my idea exactly. Thanks for helping to summarize it; it is
hard for me to express it clearly in English.

> > This patch makes it a certainty instead of a probability problem. It achieves
> > this by pushing forward the parsing of mem hotplug info ahead of init_mem_mapping().
>
> What does memory hotplug have to do with this?  I thought this was all
> about early boot.

The information about hot-pluggable memory is passed to the memblock
allocator in initmem_init()->...->acpi_numa_memory_affinity_init(),
where memblock_mark_hotplug() records it. Later, when the allocator
runs, __next_mem_range() checks this information through
memblock_is_hotpluggable().
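
The interaction can be sketched with a toy model in C. This is not
memblock itself: struct toy_region, toy_mark_hotplug() and
toy_alloc_top_down() are hypothetical stand-ins, there is no
reserved-region tracking, and the movable_node parameter reflects my
understanding that the kernel only skips hot-pluggable regions when
movable_node is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for memblock's hotplug flag; not the kernel definition. */
#define TOY_HOTPLUG 0x1u

struct toy_region {
    uint64_t base;
    uint64_t size;
    unsigned int flags;
};

/*
 * Flag every region lying fully inside [base, base + size), roughly
 * what parsing the ACPI affinity info does for hot-pluggable ranges.
 * Returns the number of regions flagged.
 */
static size_t toy_mark_hotplug(struct toy_region *r, size_t n,
                               uint64_t base, uint64_t size)
{
    size_t marked = 0;

    for (size_t i = 0; i < n; i++) {
        if (r[i].base >= base && r[i].base + r[i].size <= base + size) {
            r[i].flags |= TOY_HOTPLUG;
            marked++;
        }
    }
    return marked;
}

/*
 * Top-down allocation over regions sorted by ascending base. Regions
 * flagged as hot-pluggable are skipped when movable_node is set.
 * Returns the start address, or 0 on failure (good enough for a sketch).
 */
static uint64_t toy_alloc_top_down(const struct toy_region *r, size_t n,
                                   uint64_t size, bool movable_node)
{
    assert(size > 0);
    for (size_t i = n; i-- > 0; ) {
        if (movable_node && (r[i].flags & TOY_HOTPLUG))
            continue;
        if (r[i].size >= size)
            return r[i].base + r[i].size - size;
    }
    return 0;
}
```

With three 4 GiB regions A, B and C, an allocation made before C is
marked lands at the top of C, while the same allocation made after
toy_mark_hotplug() lands at the top of B. That ordering is the point
of parsing the hotplug info before init_mem_mapping().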

Thanks and regards,
Pingfan
