* [PATCHv6, RESEND 0/4] x86: 5-level related changes into decompression code
@ 2018-01-23 17:09 Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:09 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
This patchset is a preparation for boot-time switching between
paging modes. Please apply.
The first patch is a purely cosmetic change: it gives the file with KASLR
helpers a proper name.
The last three patches add support for booting into 5-level paging mode if
a bootloader puts the kernel above 4G.
Patch 2/4 renames l5_paging_required() to paging_prepare() and changes the
interface of the function.
Patch 3/4 allocates space for the trampoline and prepares it.
Patch 4/4 puts the trampoline to use.
Kirill A. Shutemov (4):
x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
x86/boot/compressed/64: Introduce paging_prepare()
x86/boot/compressed/64: Prepare trampoline memory
x86/boot/compressed/64: Handle 5-level paging boot if kernel is above
4G
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/head_64.S | 134 ++++++++++++---------
.../boot/compressed/{pagetable.c => kaslr_64.c} | 0
arch/x86/boot/compressed/pgtable.h | 18 +++
arch/x86/boot/compressed/pgtable_64.c | 65 ++++++++--
5 files changed, 153 insertions(+), 66 deletions(-)
rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
create mode 100644 arch/x86/boot/compressed/pgtable.h
--
2.15.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCHv6, RESEND 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
2018-01-23 17:09 [PATCHv6, RESEND 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
@ 2018-01-23 17:09 ` Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 2/4] x86/boot/compressed/64: Introduce paging_prepare() Kirill A. Shutemov
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:09 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
The name of the file -- pagetable.c -- is misleading: it only contains
helpers used for KASLR in 64-bit mode.
Let's rename the file to reflect its content.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} | 0
2 files changed, 1 insertion(+), 1 deletion(-)
rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f25e1530e064..1f734cd98fd3 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -78,7 +78,7 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
ifdef CONFIG_X86_64
- vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+ vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
vmlinux-objs-y += $(obj)/mem_encrypt.o
vmlinux-objs-y += $(obj)/pgtable_64.o
endif
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/kaslr_64.c
similarity index 100%
rename from arch/x86/boot/compressed/pagetable.c
rename to arch/x86/boot/compressed/kaslr_64.c
--
2.15.1
* [PATCHv6, RESEND 2/4] x86/boot/compressed/64: Introduce paging_prepare()
2018-01-23 17:09 [PATCHv6, RESEND 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
@ 2018-01-23 17:09 ` Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 3/4] x86/boot/compressed/64: Prepare trampoline memory Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
3 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:09 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
This patch renames l5_paging_required() to paging_prepare() and
changes the interface of the function.
This is a preparation for the next patch, which makes the function
also allocate memory for the 32-bit trampoline.
The function now returns a 128-bit structure. RAX holds the
trampoline memory address (zero for now) and RDX indicates whether we
need to enable 5-level paging.
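As a rough illustration of why the two values arrive in RAX and RDX: under
the System V x86-64 calling convention, a structure made of two 64-bit
integers is returned in that register pair. The sketch below mirrors the
shape of struct paging_config; the names paging_config_sketch and
paging_prepare_sketch, and the la57_supported parameter standing in for the
CPUID check, are hypothetical and not part of the patch.

```c
#include <stdint.h>

/*
 * Same shape as struct paging_config in the patch. uint64_t stands in for
 * unsigned long so the structure is 128 bits wide on any host.
 */
struct paging_config_sketch {
	uint64_t trampoline_start;	/* first member: returned in RAX (SysV x86-64) */
	uint64_t l5_required;		/* second member: returned in RDX */
};

/*
 * Hypothetical stand-in for paging_prepare(): the caller supplies the
 * result of the LA57 check instead of the function executing CPUID.
 */
static struct paging_config_sketch paging_prepare_sketch(int la57_supported)
{
	struct paging_config_sketch cfg = { 0, 0 };

	if (la57_supported)
		cfg.l5_required = 1;

	/* trampoline_start stays zero, as it does in this patch */
	return cfg;
}
```

The assembly caller can then test RDX directly after the call, without
any memory access for the return value.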
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/head_64.S | 41 ++++++++++++++++-------------------
arch/x86/boot/compressed/pgtable_64.c | 25 ++++++++++-----------
2 files changed, 31 insertions(+), 35 deletions(-)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fc313e29fe2c..10b4df46de84 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -304,20 +304,6 @@ ENTRY(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp
-#ifdef CONFIG_X86_5LEVEL
- /*
- * Check if we need to enable 5-level paging.
- * RSI holds real mode data and need to be preserved across
- * a function call.
- */
- pushq %rsi
- call l5_paging_required
- popq %rsi
-
- /* If l5_paging_required() returned zero, we're done here. */
- cmpq $0, %rax
- je lvl5
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we want to enable 5-level paging.
@@ -325,12 +311,28 @@ ENTRY(startup_64)
* The problem is that we cannot do it directly. Setting LA57 in
* long mode would trigger #GP. So we need to switch off long mode
* first.
+ */
+
+ /*
+ * paging_prepare() would set up the trampoline and check if we need to
+ * enable 5-level paging.
*
- * NOTE: This is not going to work if bootloader put us above 4G
- * limit.
+ * Address of the trampoline is returned in RAX.
+ * Non zero RDX on return means we need to enable 5-level paging.
*
- * The first step is go into compatibility mode.
+ * RSI holds real mode data and need to be preserved across
+ * a function call.
*/
+ pushq %rsi
+ call paging_prepare
+ popq %rsi
+
+ /* Save the trampoline address in RCX */
+ movq %rax, %rcx
+
+ /* Check if we need to enable 5-level paging */
+ cmpq $0, %rdx
+ jz lvl5
/* Clear additional page table */
leaq lvl5_pgtable(%rbx), %rdi
@@ -352,7 +354,6 @@ ENTRY(startup_64)
pushq %rax
lretq
lvl5:
-#endif
/* Zero EFLAGS */
pushq $0
@@ -490,7 +491,6 @@ relocated:
jmp *%rax
.code32
-#ifdef CONFIG_X86_5LEVEL
compatible_mode:
/* Setup data and stack segments */
movl $__KERNEL_DS, %eax
@@ -526,7 +526,6 @@ compatible_mode:
movl %eax, %cr0
lret
-#endif
no_longmode:
/* This isn't an x86-64 CPU so hang */
@@ -585,7 +584,5 @@ boot_stack_end:
.balign 4096
pgtable:
.fill BOOT_PGT_SIZE, 1, 0
-#ifdef CONFIG_X86_5LEVEL
lvl5_pgtable:
.fill PAGE_SIZE, 1, 0
-#endif
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index b4469a37e9a1..3f1697fcc7a8 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -9,20 +9,19 @@
*/
unsigned long __force_order;
-int l5_paging_required(void)
-{
- /* Check if leaf 7 is supported. */
-
- if (native_cpuid_eax(0) < 7)
- return 0;
+struct paging_config {
+ unsigned long trampoline_start;
+ unsigned long l5_required;
+};
- /* Check if la57 is supported. */
- if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
- return 0;
+struct paging_config paging_prepare(void)
+{
+ struct paging_config paging_config = {};
- /* Check if 5-level paging has already been enabled. */
- if (native_read_cr4() & X86_CR4_LA57)
- return 0;
+ /* Check if LA57 is desired and supported */
+ if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
+ (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+ paging_config.l5_required = 1;
- return 1;
+ return paging_config;
}
--
2.15.1
* [PATCHv6, RESEND 3/4] x86/boot/compressed/64: Prepare trampoline memory
2018-01-23 17:09 [PATCHv6, RESEND 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 2/4] x86/boot/compressed/64: Introduce paging_prepare() Kirill A. Shutemov
@ 2018-01-23 17:09 ` Kirill A. Shutemov
2018-01-23 17:09 ` [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
3 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:09 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
If a bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. Switching requires disabling paging,
which works fine if the kernel itself is loaded below 4G.
But if the bootloader put the kernel above 4G (not sure if anybody does
this), we would lose control as soon as paging is disabled, because the
code becomes unreachable to the CPU.
To handle the situation, we need a trampoline in lower memory that would
take care of switching on 5-level paging.
Apart from the trampoline code itself, we also need a place to store the
top-level page table in lower memory, as we don't have a way to load 64-bit
values into CR3 in 32-bit mode. We only really need 8 bytes there, as we
only use the very first entry of the page table, but we allocate a whole
page anyway.
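A minimal sketch of that single-entry table, assuming the entry carries
only the present, writable and user bits (0x7, the value the head_64.S
code used before this series). The patch itself uses _PAGE_TABLE_NOENC,
which may set additional bits, and it does not mask CR3; the masking and
the names below are illustrative assumptions.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ENTRIES_PER_TABLE 512	/* 4096-byte page / 8-byte entries */

/* Architectural x86 page table entry bits */
#define SKETCH_PAGE_PRESENT	0x1ull
#define SKETCH_PAGE_RW		0x2ull
#define SKETCH_PAGE_USER	0x4ull

/*
 * Build a new top-level table whose first and only entry points at the
 * old 4-level top-level table (the physical address taken from CR3).
 */
static void build_top_level(uint64_t table[SKETCH_ENTRIES_PER_TABLE],
			    uint64_t old_cr3)
{
	/* Every unused entry must be zero so the CPU never sees garbage */
	memset(table, 0, SKETCH_ENTRIES_PER_TABLE * sizeof(table[0]));

	/* Assumption: drop the low 12 bits of CR3 before adding the flags */
	table[0] = (old_cr3 & ~0xfffull) |
		   SKETCH_PAGE_PRESENT | SKETCH_PAGE_RW | SKETCH_PAGE_USER;
}
```

Only the first 8 bytes of the page end up non-zero, which is why a whole
page is more than strictly needed.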
We cannot have the code in the same page as the page table because there's
a risk that a CPU would read the page table speculatively and get confused
by seeing garbage. It's never a good idea to have junk in PTE entries
visible to the CPU.
We also need a small stack in the trampoline to re-enable long mode via
long return. But stack and code can share the page just fine.
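The resulting two-page layout can be sketched as follows, using the offsets
this patch defines in pgtable.h (page table at offset 0, code at PAGE_SIZE,
stack ending at the end of the second page). The struct and function names
are hypothetical, and PAGE_SIZE is assumed to be 4096.

```c
#include <stdint.h>

#define SK_PAGE_SIZE			4096u
#define SK_TRAMPOLINE_SIZE		(2 * SK_PAGE_SIZE)
#define SK_TRAMPOLINE_PGTABLE_OFFSET	0u
#define SK_TRAMPOLINE_CODE_OFFSET	SK_PAGE_SIZE
#define SK_TRAMPOLINE_CODE_SIZE		0x60u
#define SK_TRAMPOLINE_STACK_END		SK_TRAMPOLINE_SIZE

struct trampoline_layout {
	uint32_t pgtable;	/* first page: top-level page table only */
	uint32_t code;		/* second page: trampoline code starts here */
	uint32_t code_end;	/* end of the code area */
	uint32_t stack_top;	/* stack grows down from the end of page two */
};

/* Compute where each piece lives for a given trampoline base address */
static struct trampoline_layout trampoline_layout_at(uint32_t base)
{
	struct trampoline_layout l;

	l.pgtable   = base + SK_TRAMPOLINE_PGTABLE_OFFSET;
	l.code      = base + SK_TRAMPOLINE_CODE_OFFSET;
	l.code_end  = l.code + SK_TRAMPOLINE_CODE_SIZE;
	l.stack_top = base + SK_TRAMPOLINE_STACK_END;
	return l;
}
```

With these numbers the stack has the remainder of the second page above
the code area, which is far more than the single far return needs.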
This patch changes paging_prepare() to find a suitable spot in lower memory
for the trampoline. Then it copies the trampoline code there and sets up
the new top-level page table for 5-level paging.
At this point we do all the preparation, but don't use the trampoline yet.
That is done in the following patch.
The trampoline will be used even on 4-level paging machines. This way we
get better test coverage and keep the trampoline code in shape.
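The placement logic can be tried out in user space with a sketch like the
one below. It follows the arithmetic in the diff, but takes the two BIOS
data area values (the EBDA segment word at 0x40e and the base memory size
in KiB at 0x413) as parameters instead of reading low memory. The function
name and parameter names are hypothetical.

```c
#include <stdint.h>

#define SK_PAGE_SIZE		4096u
#define SK_TRAMPOLINE_SIZE	(2 * SK_PAGE_SIZE)
#define SK_BIOS_START_MIN	0x20000u	/* 128K, less than this is insane */
#define SK_BIOS_START_MAX	0x9f000u	/* 640K, absolute maximum */

/*
 * ebda_segment: 16-bit word the BIOS stores at 0x40e (a real-mode segment).
 * base_mem_kb:  16-bit word the BIOS stores at 0x413 (base memory in KiB).
 */
static uint32_t pick_trampoline_start(uint16_t ebda_segment, uint16_t base_mem_kb)
{
	uint32_t ebda_start = (uint32_t)ebda_segment << 4;
	uint32_t bios_start = (uint32_t)base_mem_kb << 10;
	uint32_t start;

	/* Fall back to the maximum if the reported value looks bogus */
	if (bios_start < SK_BIOS_START_MIN || bios_start > SK_BIOS_START_MAX)
		bios_start = SK_BIOS_START_MAX;

	/* Stay below the EBDA if it sits lower than the end of base memory */
	if (ebda_start > SK_BIOS_START_MIN && ebda_start < bios_start)
		bios_start = ebda_start;

	/* Place the trampoline just below, rounded down to a page boundary */
	start = bios_start - SK_TRAMPOLINE_SIZE;
	return start & ~(SK_PAGE_SIZE - 1);
}
```

For example, pick_trampoline_start(0x8000, 640) yields 0x7e000: the
reported 640K exceeds the maximum and is clamped, then the EBDA at 0x80000
pulls the limit further down.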
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/head_64.S | 3 ++-
arch/x86/boot/compressed/pgtable.h | 18 ++++++++++++++
arch/x86/boot/compressed/pgtable_64.c | 44 +++++++++++++++++++++++++++++++++++
3 files changed, 64 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/boot/compressed/pgtable.h
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 10b4df46de84..1bcc62a232f6 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -491,8 +491,9 @@ relocated:
jmp *%rax
.code32
+ENTRY(trampoline_32bit_src)
compatible_mode:
- /* Setup data and stack segments */
+ /* Set up data and stack segments */
movl $__KERNEL_DS, %eax
movl %eax, %ds
movl %eax, %ss
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
new file mode 100644
index 000000000000..6e0db2260147
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -0,0 +1,18 @@
+#ifndef BOOT_COMPRESSED_PAGETABLE_H
+#define BOOT_COMPRESSED_PAGETABLE_H
+
+#define TRAMPOLINE_32BIT_SIZE (2 * PAGE_SIZE)
+
+#define TRAMPOLINE_32BIT_PGTABLE_OFFSET 0
+
+#define TRAMPOLINE_32BIT_CODE_OFFSET PAGE_SIZE
+#define TRAMPOLINE_32BIT_CODE_SIZE 0x60
+
+#define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE
+
+#ifndef __ASSEMBLER__
+
+extern void (*trampoline_32bit_src)(void *return_ptr);
+
+#endif /* __ASSEMBLER__ */
+#endif /* BOOT_COMPRESSED_PAGETABLE_H */
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 3f1697fcc7a8..c8f9e93598d5 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -1,4 +1,6 @@
#include <asm/processor.h>
+#include "pgtable.h"
+#include "../string.h"
/*
* __force_order is used by special_insns.h asm code to force instruction
@@ -9,6 +11,9 @@
*/
unsigned long __force_order;
+#define BIOS_START_MIN 0x20000U /* 128K, less than this is insane */
+#define BIOS_START_MAX 0x9f000U /* 640K, absolute maximum */
+
struct paging_config {
unsigned long trampoline_start;
unsigned long l5_required;
@@ -17,11 +22,50 @@ struct paging_config {
struct paging_config paging_prepare(void)
{
struct paging_config paging_config = {};
+ unsigned long bios_start, ebda_start, *trampoline;
/* Check if LA57 is desired and supported */
if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
paging_config.l5_required = 1;
+ /*
+ * Find a suitable spot for the trampoline.
+ * This code is based on reserve_bios_regions().
+ */
+
+ ebda_start = *(unsigned short *)0x40e << 4;
+ bios_start = *(unsigned short *)0x413 << 10;
+
+ if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
+ bios_start = BIOS_START_MAX;
+
+ if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
+ bios_start = ebda_start;
+
+ /* Place the trampoline just below the end of low memory, aligned to 4k */
+ paging_config.trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
+ paging_config.trampoline_start = round_down(paging_config.trampoline_start, PAGE_SIZE);
+
+ trampoline = (unsigned long *)paging_config.trampoline_start;
+
+ /* Clear trampoline memory first */
+ memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
+
+ /* Copy trampoline code in place */
+ memcpy(trampoline + TRAMPOLINE_32BIT_CODE_OFFSET / sizeof(unsigned long),
+ &trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE);
+
+ /*
+ * For 5-level paging, set up current CR3 as the first and
+ * the only entry in a new top level page table.
+ *
+ * For 4-level paging, trampoline wouldn't touch CR3.
+ * KASLR relies on CR3 pointing to _pgtable.
+ * See initialize_identity_maps().
+ */
+ if (paging_config.l5_required)
+ trampoline[TRAMPOLINE_32BIT_PGTABLE_OFFSET] = __native_read_cr3() + _PAGE_TABLE_NOENC;
+
return paging_config;
}
--
2.15.1
* [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
2018-01-23 17:09 [PATCHv6, RESEND 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
` (2 preceding siblings ...)
2018-01-23 17:09 ` [PATCHv6, RESEND 3/4] x86/boot/compressed/64: Prepare trampoline memory Kirill A. Shutemov
@ 2018-01-23 17:09 ` Kirill A. Shutemov
2018-01-23 17:31 ` Linus Torvalds
3 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:09 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
This patch addresses a shortcoming in the current boot process on machines
that support 5-level paging.
If a bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. Switching requires disabling paging,
which works fine if the kernel itself is loaded below 4G.
But if the bootloader put the kernel above 4G (not sure if anybody does
this), we would lose control as soon as paging is disabled, because the
code becomes unreachable to the CPU.
This patch implements a trampoline in lower memory to handle this
situation.
We only need the memory for a very short time, until the main kernel
image sets up its own page tables.
We go through the trampoline even if we don't have to: if we're already
in 5-level paging mode or if we don't need to switch to it. This way the
trampoline gets tested on every boot.
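What the trampoline changes in each case can be modelled roughly like
this. The CR4 bit positions are architectural (PAE is bit 5, LA57 is
bit 12); the function names are hypothetical, and the real work is done
by the assembly in head_64.S while paging is disabled.

```c
#include <stdint.h>

#define SK_X86_CR4_PAE	(1u << 5)
#define SK_X86_CR4_LA57	(1u << 12)

/*
 * Value written back to CR4: PAE is always set, LA57 only when
 * paging_prepare() asked for 5-level paging. Bits are only ever added.
 */
static uint32_t trampoline_new_cr4(uint32_t cr4, int l5_required)
{
	cr4 |= SK_X86_CR4_PAE;
	if (l5_required)
		cr4 |= SK_X86_CR4_LA57;
	return cr4;
}

/*
 * CR3 is pointed at the trampoline's top-level table only for 5-level
 * paging; in the 4-level case it is left alone.
 */
static int trampoline_reloads_cr3(int l5_required)
{
	return l5_required != 0;
}
```

In the 4-level case the trip through the trampoline therefore amounts to
turning paging off and on again with the same page tables.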
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/head_64.S | 102 +++++++++++++++++++++++--------------
1 file changed, 65 insertions(+), 37 deletions(-)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 1bcc62a232f6..fa45e801e132 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -33,6 +33,7 @@
#include <asm/processor-flags.h>
#include <asm/asm-offsets.h>
#include <asm/bootparam.h>
+#include "pgtable.h"
/*
* Locally defined symbols should be marked hidden:
@@ -306,11 +307,23 @@ ENTRY(startup_64)
/*
* At this point we are in long mode with 4-level paging enabled,
- * but we want to enable 5-level paging.
+ * but we might want to enable 5-level paging.
*
- * The problem is that we cannot do it directly. Setting LA57 in
- * long mode would trigger #GP. So we need to switch off long mode
- * first.
+ * The problem is that we cannot do it directly. Setting CR4.LA57 in
+ * long mode would trigger #GP. So we need to switch off long mode and
+ * paging first.
+ *
+ * We also need a trampoline in lower memory to switch over from
+ * 4- to 5-level paging for cases when the bootloader puts the kernel
+ * above 4G, but didn't enable 5-level paging for us.
+ *
+ * For the trampoline, we need the top page table to reside in lower
+ * memory as we don't have a way to load 64-bit values into CR3 in
+ * 32-bit mode.
+ *
+ * We go through the trampoline even if we don't have to: if we're
+ * already in 5-level paging mode or if we don't need to switch to
+ * it. This way the trampoline code gets tested on every boot.
*/
/*
@@ -330,30 +343,21 @@ ENTRY(startup_64)
/* Save the trampoline address in RCX */
movq %rax, %rcx
- /* Check if we need to enable 5-level paging */
- cmpq $0, %rdx
- jz lvl5
-
- /* Clear additional page table */
- leaq lvl5_pgtable(%rbx), %rdi
- xorq %rax, %rax
- movq $(PAGE_SIZE/8), %rcx
- rep stosq
-
/*
- * Setup current CR3 as the first and only entry in a new top level
- * page table.
+ * Load the address of trampoline_return() into RDI.
+ * It will be used by the trampoline to return to the main code.
*/
- movq %cr3, %rdi
- leaq 0x7 (%rdi), %rax
- movq %rax, lvl5_pgtable(%rbx)
+ leaq trampoline_return(%rip), %rdi
/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
pushq $__KERNEL32_CS
- leaq compatible_mode(%rip), %rax
+ andq $~1, %rax /* Clear bit 0: encodes if 5-level paging needed */
+ leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax
pushq %rax
lretq
-lvl5:
+trampoline_return:
+ /* Restore the stack, the 32-bit trampoline uses its own stack */
+ leaq boot_stack_end(%rbx), %rsp
/* Zero EFLAGS */
pushq $0
@@ -491,45 +495,71 @@ relocated:
jmp *%rax
.code32
+/*
+ * This is the 32-bit trampoline that will be copied over to low memory.
+ *
+ * RDI contains the return address (might be above 4G).
+ * ECX contains the base address of the trampoline memory.
+ * Non zero RDX on return means we need to enable 5-level paging.
+ */
ENTRY(trampoline_32bit_src)
-compatible_mode:
/* Set up data and stack segments */
movl $__KERNEL_DS, %eax
movl %eax, %ds
movl %eax, %ss
+ /* Setup new stack */
+ leal TRAMPOLINE_32BIT_STACK_END(%ecx), %esp
+
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax
movl %eax, %cr0
- /* Point CR3 to 5-level paging */
- leal lvl5_pgtable(%ebx), %eax
+ /* For 5-level paging, point CR3 to the trampoline's new top level page table */
+ cmpl $0, %edx
+ jz 1f
+ leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%ecx), %eax
movl %eax, %cr3
+1:
- /* Enable PAE and LA57 mode */
+ /* Enable PAE and LA57 (if required) paging modes */
movl %cr4, %eax
- orl $(X86_CR4_PAE | X86_CR4_LA57), %eax
+ orl $X86_CR4_PAE, %eax
+ cmpl $0, %edx
+ jz 1f
+ orl $X86_CR4_LA57, %eax
+1:
movl %eax, %cr4
- /* Calculate address we are running at */
- call 1f
-1: popl %edi
- subl $1b, %edi
+ /* Calculate address of paging_enabled() once we are executing in the trampoline */
+ leal paging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFFSET(%ecx), %eax
- /* Prepare stack for far return to Long Mode */
+ /* Prepare the stack for far return to Long Mode */
pushl $__KERNEL_CS
- leal lvl5(%edi), %eax
- push %eax
+ pushl %eax
- /* Enable paging back */
+ /* Enable paging again */
movl $(X86_CR0_PG | X86_CR0_PE), %eax
movl %eax, %cr0
lret
+ .code64
+paging_enabled:
+ /* Return from the trampoline */
+ jmp *%rdi
+
+ /*
+ * The trampoline code has a size limit.
+ * Make sure we fail to compile if the trampoline code grows
+ * beyond TRAMPOLINE_32BIT_CODE_SIZE bytes.
+ */
+ .org trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
+
+ .code32
no_longmode:
- /* This isn't an x86-64 CPU so hang */
+ /* This isn't an x86-64 CPU, so hang intentionally, we cannot continue */
1:
hlt
jmp 1b
@@ -585,5 +615,3 @@ boot_stack_end:
.balign 4096
pgtable:
.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
- .fill PAGE_SIZE, 1, 0
--
2.15.1
* Re: [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
2018-01-23 17:09 ` [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
@ 2018-01-23 17:31 ` Linus Torvalds
2018-01-23 17:37 ` Kirill A. Shutemov
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2018-01-23 17:31 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Ingo Molnar, the arch/x86 maintainers, Thomas Gleixner,
H. Peter Anvin, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, Linux Kernel Mailing List
On Tue, Jan 23, 2018 at 9:09 AM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> But if the bootloader put the kernel above 4G (not sure if anybody does
> this), we would lose control as soon as paging is disabled, because the
> code becomes unreachable to the CPU.
I do wonder if we need this. Why would a bootloader ever put the data
above 4G? Does this really happen? Wouldn't it be easier to just say
"bootloaders better put the kernel in the low 4G"?
Linus
* Re: [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
2018-01-23 17:31 ` Linus Torvalds
@ 2018-01-23 17:37 ` Kirill A. Shutemov
2018-01-23 18:13 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2018-01-23 17:37 UTC (permalink / raw)
To: Linus Torvalds
Cc: Kirill A. Shutemov, Ingo Molnar, the arch/x86 maintainers,
Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
Cyrill Gorcunov, Borislav Petkov, Andi Kleen, linux-mm,
Linux Kernel Mailing List
On Tue, Jan 23, 2018 at 09:31:16AM -0800, Linus Torvalds wrote:
> On Tue, Jan 23, 2018 at 9:09 AM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > But if the bootloader put the kernel above 4G (not sure if anybody does
> > this), we would lose control as soon as paging is disabled, because the
> > code becomes unreachable to the CPU.
>
> I do wonder if we need this. Why would a bootloader ever put the data
> above 4G? Does this really happen? Wouldn't it be easier to just say
> "bootloaders better put the kernel in the low 4G"?
I don't know much about bootloaders, but do we even have such guarantee
for in-kernel bootloader -- kexec?
--
Kirill A. Shutemov
* Re: [PATCHv6, RESEND 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
2018-01-23 17:37 ` Kirill A. Shutemov
@ 2018-01-23 18:13 ` Andi Kleen
0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2018-01-23 18:13 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Linus Torvalds, Kirill A. Shutemov, Ingo Molnar,
the arch/x86 maintainers, Thomas Gleixner, H. Peter Anvin,
Andy Lutomirski, Cyrill Gorcunov, Borislav Petkov, linux-mm,
Linux Kernel Mailing List
On Tue, Jan 23, 2018 at 08:37:03PM +0300, Kirill A. Shutemov wrote:
> On Tue, Jan 23, 2018 at 09:31:16AM -0800, Linus Torvalds wrote:
> > On Tue, Jan 23, 2018 at 9:09 AM, Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > >
> > > But if the bootloader put the kernel above 4G (not sure if anybody does
> > > this), we would lose control as soon as paging is disabled, because the
> > > code becomes unreachable to the CPU.
> >
> > I do wonder if we need this. Why would a bootloader ever put the data
> > above 4G? Does this really happen? Wouldn't it be easier to just say
> > "bootloaders better put the kernel in the low 4G"?
>
> I don't know much about bootloaders, but do we even have such guarantee
> for in-kernel bootloader -- kexec?
There's no such guarantee, so we need it at least for kexec.
-Andi