From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Ingo Molnar <mingo@redhat.com>,
x86@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
Kai Huang <kai.huang@linux.intel.com>,
Jacob Pan <jacob.jun.pan@linux.intel.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv5 00/19] MKTME enabling
Date: Tue, 17 Jul 2018 14:20:10 +0300 [thread overview]
Message-ID: <20180717112029.42378-1-kirill.shutemov@linux.intel.com> (raw)
Multikey Total Memory Encryption (MKTME)[1] is a technology that allows
transparent memory encryption in upcoming Intel platforms. See overview
below.
Here's updated version of my patchset that brings support of MKTME.
Please review and consider applying.
The patchset provides in-kernel infrastructure for MKTME, but doesn't yet
have userspace interface.
First 8 patches are for core-mm. The rest is x86-specific.
The patchset is on top of tip- tree plus page_ext cleanups I've posted
earlier[2]. page_ext cleanups are in -mm tree now.
Below is performance numbers for kernel build. Enabling MKTME doesn't
affect performance of non-encrypted memory allocation.
For encrypted memory allocation requires cache flush on allocation and
freeing encrypted memory. For kernel build it results in ~20% performance
degradation if we allocate all anonymous memory as encrypted.
We would need to maintain per-KeyID pool of free pages to minimize cache
flushing. I'm going to work on the optimization on top of this patchset.
The patchset also can be found here:
git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git mktme/wip
v5:
- Do not merge VMAs with different KeyID (for real).
- Do not use zero page in encrypted VMAs.
- Avoid division in __pa(). The division is replaced with masking which
makes it near-free. I was not able to measure difference comparing to
base line. direct_mapping_size now has to be power-of-2 if MKTME
enabled. Only in this case we can use masking there.
v4:
- Address Dave's feedback.
- Add performance numbers.
v3:
- Kernel now can access encrypted pages via per-KeyID direct mapping.
- Rework page allocation for encrypted memory to minimize overhead on
non-encrypted pages. It comes with cost for allocation of encrypted
pages: we have to flush cache on every time we allocate *and* free
encrypted page. We will need to optimize it later.
v2:
- Store KeyID of page in page_ext->flags rather than in anon_vma.
anon_vma approach turned out to be problematic. The main problem is
that anon_vma of the page is no longer stable after last mapcount has
gone. We would like to preserve last used KeyID even for freed
pages as it allows to avoid unnecessary cache flushing on allocation
of an encrypted page. page_ext serves this well enough.
- KeyID is now propagated through page allocator. No need in GFP_ENCRYPT
anymore.
- Patch "Decouple dynamic __PHYSICAL_MASK from AMD SME" has been fix to
work with AMD SEV (need to be confirmed by AMD folks).
------------------------------------------------------------------------------
MKTME is built on top of TME. TME allows encryption of the entirety of
system memory using a single key. MKTME allows to have multiple encryption
domains, each having own key -- different memory pages can be encrypted
with different keys.
Key design points of Intel MKTME:
- Initial HW implementation would support upto 63 keys (plus one default
TME key). But the number of keys may be as low as 3, depending to SKU
and BIOS settings
- To access encrypted memory you need to use mapping with proper KeyID
int the page table entry. KeyID is encoded in upper bits of PFN in page
table entry.
- CPU does not enforce coherency between mappings of the same physical
page with different KeyIDs or encryption keys. We wound need to take
care about flushing cache on allocation of encrypted page and on
returning it back to free pool.
- For managing keys, there's MKTME_KEY_PROGRAM leaf of the new PCONFIG
(platform configuration) instruction. It allows load and clear keys
associated with a KeyID. You can also ask CPU to generate a key for
you or disable memory encryption when a KeyID is used.
Performance numbers for kernel build:
Base (tip- tree):
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
5664711.936917 task-clock (msec) # 34.815 CPUs utilized ( +- 0.02% )
1,033,886 context-switches # 0.183 K/sec ( +- 0.37% )
189,308 cpu-migrations # 0.033 K/sec ( +- 0.39% )
104,951,554 page-faults # 0.019 M/sec ( +- 0.01% )
16,907,670,543,945 cycles # 2.985 GHz ( +- 0.01% )
12,662,345,427,578 stalled-cycles-frontend # 74.89% frontend cycles idle ( +- 0.02% )
9,936,469,878,830 instructions # 0.59 insn per cycle
# 1.27 stalled cycles per insn ( +- 0.00% )
2,179,100,082,611 branches # 384.680 M/sec ( +- 0.00% )
91,235,200,652 branch-misses # 4.19% of all branches ( +- 0.01% )
162.706797586 seconds time elapsed ( +- 0.04% )
CONFIG_X86_INTEL_MKTME=y, no encrypted memory:
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
5668508.245004 task-clock (msec) # 34.872 CPUs utilized ( +- 0.02% )
1,032,034 context-switches # 0.182 K/sec ( +- 0.90% )
188,098 cpu-migrations # 0.033 K/sec ( +- 1.15% )
104,964,084 page-faults # 0.019 M/sec ( +- 0.01% )
16,919,270,913,026 cycles # 2.985 GHz ( +- 0.02% )
12,672,067,815,805 stalled-cycles-frontend # 74.90% frontend cycles idle ( +- 0.02% )
9,942,560,135,477 instructions # 0.59 insn per cycle
# 1.27 stalled cycles per insn ( +- 0.00% )
2,180,800,745,687 branches # 384.722 M/sec ( +- 0.00% )
91,167,857,700 branch-misses # 4.18% of all branches ( +- 0.02% )
162.552503629 seconds time elapsed ( +- 0.10% )
CONFIG_X86_INTEL_MKTME=y, all anonymous memory encrypted with KeyID-1, pay
cache flush overhead on allocation and free:
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
7041851.999259 task-clock (msec) # 35.915 CPUs utilized ( +- 0.01% )
1,118,938 context-switches # 0.159 K/sec ( +- 0.49% )
197,039 cpu-migrations # 0.028 K/sec ( +- 0.80% )
104,970,021 page-faults # 0.015 M/sec ( +- 0.00% )
21,025,639,251,627 cycles # 2.986 GHz ( +- 0.01% )
16,729,451,765,492 stalled-cycles-frontend # 79.57% frontend cycles idle ( +- 0.02% )
10,010,727,735,588 instructions # 0.48 insn per cycle
# 1.67 stalled cycles per insn ( +- 0.00% )
2,197,110,181,421 branches # 312.007 M/sec ( +- 0.00% )
91,119,463,513 branch-misses # 4.15% of all branches ( +- 0.01% )
196.072361087 seconds time elapsed ( +- 0.14% )
[1] https://software.intel.com/sites/default/files/managed/a5/16/Multi-Key-Total-Memory-Encryption-Spec.pdf
[2] https://lkml.kernel.org/r/20180531135457.20167-1-kirill.shutemov@linux.intel.com
Kirill A. Shutemov (19):
mm: Do no merge VMAs with different encryption KeyIDs
mm: Do not use zero page in encrypted pages
mm/ksm: Do not merge pages with different KeyIDs
mm/page_alloc: Unify alloc_hugepage_vma()
mm/page_alloc: Handle allocation for encrypted memory
mm/khugepaged: Handle encrypted pages
x86/mm: Mask out KeyID bits from page table entry pfn
x86/mm: Introduce variables to store number, shift and mask of KeyIDs
x86/mm: Preserve KeyID on pte_modify() and pgprot_modify()
x86/mm: Implement page_keyid() using page_ext
x86/mm: Implement vma_keyid()
x86/mm: Implement prep_encrypted_page() and arch_free_page()
x86/mm: Rename CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING
x86/mm: Allow to disable MKTME after enumeration
x86/mm: Detect MKTME early
x86/mm: Calculate direct mapping size
x86/mm: Implement sync_direct_mapping()
x86/mm: Handle encrypted memory in page_to_virt() and __pa()
x86: Introduce CONFIG_X86_INTEL_MKTME
Documentation/x86/x86_64/mm.txt | 4 +
arch/alpha/include/asm/page.h | 2 +-
arch/s390/include/asm/pgtable.h | 2 +-
arch/x86/Kconfig | 21 +-
arch/x86/include/asm/mktme.h | 47 +++
arch/x86/include/asm/page.h | 1 +
arch/x86/include/asm/page_64.h | 4 +-
arch/x86/include/asm/pgtable_types.h | 15 +-
arch/x86/include/asm/setup.h | 6 +
arch/x86/kernel/cpu/intel.c | 32 +-
arch/x86/kernel/head64.c | 4 +
arch/x86/kernel/setup.c | 3 +
arch/x86/mm/Makefile | 2 +
arch/x86/mm/init_64.c | 68 ++++
arch/x86/mm/kaslr.c | 11 +-
arch/x86/mm/mktme.c | 546 +++++++++++++++++++++++++++
fs/userfaultfd.c | 7 +-
include/linux/gfp.h | 54 ++-
include/linux/migrate.h | 12 +-
include/linux/mm.h | 20 +-
include/linux/page_ext.h | 11 +-
mm/compaction.c | 1 +
mm/huge_memory.c | 3 +-
mm/khugepaged.c | 10 +
mm/ksm.c | 3 +
mm/madvise.c | 2 +-
mm/memory.c | 3 +-
mm/mempolicy.c | 31 +-
mm/migrate.c | 4 +-
mm/mlock.c | 2 +-
mm/mmap.c | 31 +-
mm/mprotect.c | 2 +-
mm/page_alloc.c | 47 +++
mm/page_ext.c | 3 +
34 files changed, 954 insertions(+), 60 deletions(-)
create mode 100644 arch/x86/include/asm/mktme.h
create mode 100644 arch/x86/mm/mktme.c
--
2.18.0
next reply other threads:[~2018-07-17 11:21 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-17 11:20 Kirill A. Shutemov [this message]
2018-07-17 11:20 ` [PATCHv5 01/19] mm: Do no merge VMAs with different encryption KeyIDs Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 02/19] mm: Do not use zero page in encrypted pages Kirill A. Shutemov
2018-07-18 17:36 ` Dave Hansen
2018-07-19 7:16 ` Kirill A. Shutemov
2018-07-19 13:58 ` Dave Hansen
2018-07-20 12:16 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 03/19] mm/ksm: Do not merge pages with different KeyIDs Kirill A. Shutemov
2018-07-18 17:38 ` Dave Hansen
2018-07-19 7:32 ` Kirill A. Shutemov
2018-07-19 14:02 ` Dave Hansen
2018-07-20 12:23 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 04/19] mm/page_alloc: Unify alloc_hugepage_vma() Kirill A. Shutemov
2018-07-18 17:43 ` Dave Hansen
2018-07-17 11:20 ` [PATCHv5 05/19] mm/page_alloc: Handle allocation for encrypted memory Kirill A. Shutemov
2018-07-18 23:03 ` Dave Hansen
2018-07-19 8:27 ` Kirill A. Shutemov
2018-07-19 14:05 ` Dave Hansen
2018-07-20 12:25 ` Kirill A. Shutemov
2018-07-26 14:25 ` Michal Hocko
2018-07-17 11:20 ` [PATCHv5 06/19] mm/khugepaged: Handle encrypted pages Kirill A. Shutemov
2018-07-18 23:11 ` Dave Hansen
2018-07-19 8:59 ` Kirill A. Shutemov
2018-07-19 14:13 ` Dave Hansen
2018-07-20 12:29 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 07/19] x86/mm: Mask out KeyID bits from page table entry pfn Kirill A. Shutemov
2018-07-18 23:13 ` Dave Hansen
2018-07-19 9:54 ` Kirill A. Shutemov
2018-07-19 14:19 ` Dave Hansen
2018-07-20 12:31 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 08/19] x86/mm: Introduce variables to store number, shift and mask of KeyIDs Kirill A. Shutemov
2018-07-18 23:19 ` Dave Hansen
2018-07-19 10:21 ` Kirill A. Shutemov
2018-07-19 12:37 ` Thomas Gleixner
2018-07-19 13:12 ` Kirill A. Shutemov
2018-07-19 13:18 ` Thomas Gleixner
2018-07-19 13:23 ` Kirill A. Shutemov
2018-07-19 13:40 ` Thomas Gleixner
2018-07-20 12:34 ` Kirill A. Shutemov
2018-07-20 13:17 ` Thomas Gleixner
2018-07-20 13:40 ` Kirill A. Shutemov
2018-07-19 14:23 ` Dave Hansen
2018-07-20 12:34 ` Kirill A. Shutemov
2018-07-31 0:08 ` Kai Huang
2018-07-17 11:20 ` [PATCHv5 09/19] x86/mm: Preserve KeyID on pte_modify() and pgprot_modify() Kirill A. Shutemov
2018-07-18 23:30 ` Dave Hansen
2018-07-20 12:42 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 10/19] x86/mm: Implement page_keyid() using page_ext Kirill A. Shutemov
2018-07-18 23:38 ` Dave Hansen
2018-07-23 9:45 ` Kirill A. Shutemov
2018-07-23 17:22 ` Alison Schofield
2018-07-17 11:20 ` [PATCHv5 11/19] x86/mm: Implement vma_keyid() Kirill A. Shutemov
2018-07-18 23:40 ` Dave Hansen
2018-07-23 9:47 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 12/19] x86/mm: Implement prep_encrypted_page() and arch_free_page() Kirill A. Shutemov
2018-07-18 23:53 ` Dave Hansen
2018-07-23 9:50 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 13/19] x86/mm: Rename CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 14/19] x86/mm: Allow to disable MKTME after enumeration Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 15/19] x86/mm: Detect MKTME early Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 16/19] x86/mm: Calculate direct mapping size Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 17/19] x86/mm: Implement sync_direct_mapping() Kirill A. Shutemov
2018-07-19 0:01 ` Dave Hansen
2018-07-23 10:04 ` Kirill A. Shutemov
2018-07-23 12:25 ` Dave Hansen
2018-07-17 11:20 ` [PATCHv5 18/19] x86/mm: Handle encrypted memory in page_to_virt() and __pa() Kirill A. Shutemov
2018-07-18 22:21 ` Thomas Gleixner
2018-07-23 10:12 ` Kirill A. Shutemov
2018-07-26 17:26 ` Dave Hansen
2018-07-27 13:49 ` Kirill A. Shutemov
2018-07-17 11:20 ` [PATCHv5 19/19] x86: Introduce CONFIG_X86_INTEL_MKTME Kirill A. Shutemov
2018-08-15 7:48 ` Pavel Machek
2018-08-17 9:24 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180717112029.42378-1-kirill.shutemov@linux.intel.com \
--to=kirill.shutemov@linux.intel.com \
--cc=dave.hansen@intel.com \
--cc=hpa@zytor.com \
--cc=jacob.jun.pan@linux.intel.com \
--cc=kai.huang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox