linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Ingo Molnar <mingo@redhat.com>,
	x86@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Borislav Petkov <bp@suse.de>, Andi Kleen <ak@linux.intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv4 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
Date: Tue,  5 Dec 2017 16:59:42 +0300
Message-ID: <20171205135942.24634-5-kirill.shutemov@linux.intel.com>
In-Reply-To: <20171205135942.24634-1-kirill.shutemov@linux.intel.com>

This patch addresses a shortcoming in the current boot process on machines
that support 5-level paging.

If a bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. Switching requires disabling paging.
This works fine if the kernel itself is loaded below 4G.

But if the bootloader puts the kernel above 4G (not sure if anybody does
this), we would lose control as soon as paging is disabled, because the
code becomes unreachable.

This patch implements a trampoline in lower memory to handle this
situation.

We only need the memory for a very short time, until the main kernel image
sets up its own page tables.

We go through the trampoline even if we don't have to: if we're already in
5-level paging mode or if we don't need to switch to it. This way the
trampoline gets tested on every boot.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
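Note for readers (not part of the commit): the hand-off from startup_64 to
the trampoline can be modelled in C. This is only a sketch under stated
assumptions. The three offset values below are guesses at the layout (the
real definitions are in pgtable.h from patch 3/4), and struct trampoline and
decode_trampoline() are hypothetical names used for illustration. It shows
how bit 0 of the value in RAX carries the "5-level paging needed" flag while
the remaining bits give the trampoline base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* ASSUMED layout; the authoritative values are in pgtable.h (patch 3/4). */
#define TRAMPOLINE_32BIT_PGTABLE_OFF 0u
#define TRAMPOLINE_32BIT_CODE_OFF    PAGE_SIZE
#define TRAMPOLINE_32BIT_STACK_END   (2u * PAGE_SIZE)

/* Hypothetical helper type: what the assembly derives from RAX. */
struct trampoline {
	uint32_t base;      /* trampoline memory base, bit 0 cleared */
	bool     need_la57; /* bit 0: 5-level paging has to be enabled */
	uint32_t pgtable;   /* loaded into CR3 only when need_la57 is set */
	uint32_t code;      /* far-return target in compatibility mode */
	uint32_t stack_end; /* initial ESP of the 32-bit trampoline */
};

static struct trampoline decode_trampoline(uint64_t rax)
{
	struct trampoline t;

	/* The trampoline must stay addressable once paging is disabled. */
	assert(rax <= UINT32_MAX);

	t.need_la57 = (rax & 1) != 0;
	t.base      = (uint32_t)rax & ~1u;
	t.pgtable   = t.base + TRAMPOLINE_32BIT_PGTABLE_OFF;
	t.code      = t.base + TRAMPOLINE_32BIT_CODE_OFF;
	t.stack_end = t.base + TRAMPOLINE_32BIT_STACK_END;
	return t;
}
```

With these assumed offsets, decode_trampoline(0x9d001) would report
need_la57 set, a base of 0x9d000 and a code address of 0x9e000.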
 arch/x86/boot/compressed/head_64.S | 100 ++++++++++++++++++++++++-------------
 1 file changed, 66 insertions(+), 34 deletions(-)
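Note for readers (not part of the commit): the control-register updates done
by the 32-bit trampoline can be summarised in C. This is a rough model, not
the implementation. The helper names are hypothetical, and the bit positions
are the architectural ones (CR0.PE bit 0, CR0.PG bit 31, CR4.PAE bit 5,
CR4.LA57 bit 12). The point is that PAE is always set, while LA57 is set only
when the flag passed in bit 0 of ECX asks for it.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE   (1u << 0)
#define X86_CR0_PG   (1u << 31)
#define X86_CR4_PAE  (1u << 5)
#define X86_CR4_LA57 (1u << 12)

/* Step 1: clear CR0.PG; LA57 cannot be changed while paging is on. */
static uint32_t cr0_paging_off(uint32_t cr0)
{
	return cr0 & ~X86_CR0_PG;
}

/* Step 2: PAE is always required; LA57 only if 5-level paging is needed. */
static uint32_t cr4_for_long_mode(uint32_t cr4, bool need_la57)
{
	cr4 |= X86_CR4_PAE;
	if (need_la57)
		cr4 |= X86_CR4_LA57;
	return cr4;
}

/* Step 3: value written to CR0 to turn paging back on. */
static uint32_t cr0_paging_on(void)
{
	return X86_CR0_PG | X86_CR0_PE;
}
```

Keeping the LA57 decision in a single flag is what lets the same trampoline
run on every boot, whether or not a paging-mode switch is needed.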

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 92d47b1bf10a..b1a6750805aa 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -33,6 +33,7 @@
 #include <asm/processor-flags.h>
 #include <asm/asm-offsets.h>
 #include <asm/bootparam.h>
+#include "pgtable.h"
 
 /*
  * Locally defined symbols should be marked hidden:
@@ -306,11 +307,23 @@ ENTRY(startup_64)
 
 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
-	 * but we want to enable 5-level paging.
+	 * but we might want to enable 5-level paging.
 	 *
-	 * The problem is that we cannot do it directly. Setting LA57 in
-	 * long mode would trigger #GP. So we need to switch off long mode
-	 * first.
+	 * The problem is that we cannot do it directly. Setting CR4.LA57
+	 * in long mode would trigger #GP. So we need to switch off
+	 * long mode and paging first.
+	 *
+	 * We also need a trampoline in lower memory to switch over from
+	 * 4- to 5-level paging for cases when the bootloader puts the
+	 * kernel above 4G, but doesn't enable 5-level paging for us.
+	 *
+	 * For the trampoline, we need the top-level page table in lower
+	 * memory as we don't have a way to load a 64-bit value into CR3
+	 * from 32-bit mode.
+	 *
+	 * We go through the trampoline even if we don't have to: if we're
+	 * already in 5-level paging mode or if we don't need to switch to
+	 * it. This way the trampoline code gets tested on every boot.
 	 */
 
 	/*
@@ -329,31 +342,22 @@ ENTRY(startup_64)
 
 	/* Save trampoline address in RCX */
 	movq	%rax, %rcx
-	andq	$~1, %rcx
-
-	testq	$1, %rax
-	jz	lvl5
-
-	/* Clear additional page table */
-	leaq	lvl5_pgtable(%rbx), %rdi
-	xorq	%rax, %rax
-	movq	$(PAGE_SIZE/8), %rcx
-	rep	stosq
 
 	/*
-	 * Setup current CR3 as the first and only entry in a new top level
-	 * page table.
+	 * Load address of trampoline_return into RDI.
+	 * It will be used by trampoline to return to main code.
 	 */
-	movq	%cr3, %rdi
-	leaq	0x7 (%rdi), %rax
-	movq	%rax, lvl5_pgtable(%rbx)
+	leaq	trampoline_return(%rip), %rdi
 
 	/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
 	pushq	$__KERNEL32_CS
-	leaq	compatible_mode(%rip), %rax
+	andq	$~1, %rax /* Clear bit 0: encodes if 5-level paging needed */
+	leaq	TRAMPOLINE_32BIT_CODE_OFF(%rax), %rax
 	pushq	%rax
 	lretq
-lvl5:
+trampoline_return:
+	/* Restore stack, 32-bit trampoline uses own stack */
+	leaq	boot_stack_end(%rbx), %rsp
 
 	/* Zero EFLAGS */
 	pushq	$0
@@ -491,36 +495,53 @@ relocated:
 	jmp	*%rax
 
 	.code32
+/*
+ * This is the 32-bit trampoline that will be copied over to low memory.
+ *
+ * RDI contains return address (might be above 4G).
+ * ECX contains the base address of trampoline memory.
+ * Bit 0 of ECX encodes if 5-level paging is required.
+ */
 ENTRY(trampoline_32bit_src)
-compatible_mode:
 	/* Setup data and stack segments */
 	movl	$__KERNEL_DS, %eax
 	movl	%eax, %ds
 	movl	%eax, %ss
 
+	/* Save base address of trampoline in EDX, clearing bit 0 */
+	movl	%ecx, %edx
+	andl	$~1, %edx
+
+	/* Setup new stack */
+	leal	TRAMPOLINE_32BIT_STACK_END (%edx), %esp
+
 	/* Disable paging */
 	movl	%cr0, %eax
 	btrl	$X86_CR0_PG_BIT, %eax
 	movl	%eax, %cr0
 
-	/* Point CR3 to 5-level paging */
-	leal	lvl5_pgtable(%ebx), %eax
+	/* For 5-level paging, point CR3 to trampoline's new top level page table */
+	testl	$1, %ecx
+	jz	1f
+	leal	TRAMPOLINE_32BIT_PGTABLE_OFF (%edx), %eax
 	movl	%eax, %cr3
+1:
 
-	/* Enable PAE and LA57 mode */
+	/* Enable PAE and LA57 (if required) modes */
 	movl	%cr4, %eax
-	orl	$(X86_CR4_PAE | X86_CR4_LA57), %eax
+	orl	$X86_CR4_PAE, %eax
+	testl	$1, %ecx
+	jz	1f
+	orl	$X86_CR4_LA57, %eax
+1:
 	movl	%eax, %cr4
 
-	/* Calculate address we are running at */
-	call	1f
-1:	popl	%edi
-	subl	$1b, %edi
+	/* Calculate address of paging_enabled once we are in trampoline */
+	leal	paging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFF (%edx), %eax
 
 	/* Prepare stack for far return to Long Mode */
 	pushl	$__KERNEL_CS
-	leal	lvl5(%edi), %eax
-	push	%eax
+	pushl	%eax
 
 	/* Enable paging back */
 	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
@@ -528,6 +549,19 @@ compatible_mode:
 
 	lret
 
+	.code64
+paging_enabled:
+	/* Return from the trampoline */
+	jmp	*%rdi
+
+	/*
+	 * Bound the size of the trampoline code.
+	 * The build fails if the trampoline code grows beyond
+	 * TRAMPOLINE_32BIT_CODE_SIZE bytes.
+	 */
+	.org	trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
+
+	.code32
 no_longmode:
 	/* This isn't an x86-64 CPU so hang */
 1:
@@ -585,5 +619,3 @@ boot_stack_end:
 	.balign 4096
 pgtable:
 	.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
-	.fill PAGE_SIZE, 1, 0
-- 
2.15.0


Thread overview: 13+ messages
2017-12-05 13:59 [PATCHv4 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-12-05 13:59 ` [PATCHv4 1/4] x86/boot/compressed/64: Fix build with GCC < 5 Kirill A. Shutemov
2017-12-07  6:16   ` Ingo Molnar
2017-12-05 13:59 ` [PATCHv4 2/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
2017-12-07  6:17   ` Ingo Molnar
2017-12-05 13:59 ` [PATCHv4 3/4] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
2017-12-07  6:30   ` Ingo Molnar
2017-12-07  8:17     ` Matthew Wilcox
2017-12-07  8:24       ` Ingo Molnar
2017-12-08 11:07     ` Kirill A. Shutemov
2017-12-08 11:28       ` Ingo Molnar
2017-12-05 13:59 ` Kirill A. Shutemov [this message]
2017-12-07  7:03   ` [PATCHv4 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Ingo Molnar
