From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
    Catalin Marinas, Will Deacon, "H . J . Lu", Andi Kleen,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Shutemov" Subject: [QEMU] x86: Implement Linear Address Masking support Date: Fri, 5 Feb 2021 18:16:22 +0300 Message-Id: <20210205151631.43511-3-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210205151631.43511-1-kirill.shutemov@linux.intel.com> References: <20210205151631.43511-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Linear Address Masking feature makes CPU ignore some bits of the virtual address. These bits can be used to encode metadata. The feature is enumerated with CPUID.(EAX=3D07H, ECX=3D01H):EAX.LAM[bit 2= 6]. CR3.LAM_U57[bit 62] allows to encode 6 bits of metadata in bits 62:57 of user pointers. CR3.LAM_U48[bit 61] allows to encode 15 bits of metadata in bits 62:48 of user pointers. CR4.LAM_SUP[bit 28] allows to encode metadata of supervisor pointers. If 5-level paging is in use, 6 bits of metadata can be encoded in 62:57. For 4-level paging, 15 bits of metadata can be encoded in bits 62:48. QEMU strips address from the metadata bits and gets it to canonical shape before handling memory access. It has to be done very early before TLB lookup. Signed-off-by: Kirill A. Shutemov --- accel/tcg/cputlb.c | 54 +++++++++++++++++++++++---------------- include/hw/core/cpu.h | 1 + target/i386/cpu.c | 5 ++-- target/i386/cpu.h | 7 +++++ target/i386/excp_helper.c | 28 +++++++++++++++++++- target/i386/helper.c | 2 +- 6 files changed, 71 insertions(+), 26 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 42ab79c1a582..f2d27134474f 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1271,6 +1271,17 @@ static inline ram_addr_t qemu_ram_addr_from_host_n= ofail(void *ptr) return ram_addr; } =20 +static vaddr clean_addr(CPUState *cpu, vaddr addr) +{ + CPUClass *cc =3D CPU_GET_CLASS(cpu); + + if (cc->do_clean_addr) { + addr =3D cc->do_clean_addr(cpu, addr); + } + + return addr; +} + /* * Note: tlb_fill() can trigger a resize of the TLB. This means that all= of the * caller's prior references to the TLB table (e.g. CPUTLBEntry pointers= ) must @@ -1702,9 +1713,11 @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong= addr, int mmu_idx, =20 /* Probe for a read-modify-write atomic operation. Do not allow unalign= ed * operations, or io operations to proceed. Return the host address. *= / -static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr, +static void *atomic_mmu_lookup(CPUArchState *env, target_ulong address, TCGMemOpIdx oi, uintptr_t retaddr) { + CPUState *cpu =3D env_cpu(env); + target_ulong addr =3D clean_addr(cpu, address); size_t mmu_idx =3D get_mmuidx(oi); uintptr_t index =3D tlb_index(env, mmu_idx, addr); CPUTLBEntry *tlbe =3D tlb_entry(env, mmu_idx, addr); @@ -1720,8 +1733,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, t= arget_ulong addr, /* Enforce guest required alignment. */ if (unlikely(a_bits > 0 && (addr & ((1 << a_bits) - 1)))) { /* ??? Maybe indicate atomic op to cpu_unaligned_access */ - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, retaddr); + cpu_unaligned_access(cpu, addr, MMU_DATA_STORE, mmu_idx, retaddr= ); } =20 /* Enforce qemu required alignment. */ @@ -1736,8 +1748,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, t= arget_ulong addr, /* Check TLB entry and enforce page permissions. 
     if (!tlb_hit(tlb_addr, addr)) {
         if (!VICTIM_TLB_HIT(addr_write, addr)) {
-            tlb_fill(env_cpu(env), addr, 1 << s_bits, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
+            tlb_fill(cpu, addr, 1 << s_bits, MMU_DATA_STORE, mmu_idx, retaddr);
             index = tlb_index(env, mmu_idx, addr);
             tlbe = tlb_entry(env, mmu_idx, addr);
         }
@@ -1753,8 +1764,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
 
     /* Let the guest notice RMW on a write-only page.  */
     if (unlikely(tlbe->addr_read != (tlb_addr & ~TLB_NOTDIRTY))) {
-        tlb_fill(env_cpu(env), addr, 1 << s_bits, MMU_DATA_LOAD,
-                 mmu_idx, retaddr);
+        tlb_fill(cpu, addr, 1 << s_bits, MMU_DATA_LOAD, mmu_idx, retaddr);
         /* Since we don't support reads and writes to different addresses,
            and we do have the proper page loaded for write, this shouldn't
            ever return.  But just in case, handle via stop-the-world.  */
@@ -1764,14 +1774,14 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
 
     if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
-        notdirty_write(env_cpu(env), addr, 1 << s_bits,
+        notdirty_write(cpu, addr, 1 << s_bits,
                        &env_tlb(env)->d[mmu_idx].iotlb[index], retaddr);
     }
 
     return hostaddr;
 
 stop_the_world:
-    cpu_loop_exit_atomic(env_cpu(env), retaddr);
+    cpu_loop_exit_atomic(cpu, retaddr);
 }
 
 /*
@@ -1810,10 +1820,12 @@ load_memop(const void *haddr, MemOp op)
 }
 
 static inline uint64_t QEMU_ALWAYS_INLINE
-load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+load_helper(CPUArchState *env, target_ulong address, TCGMemOpIdx oi,
             uintptr_t retaddr, MemOp op, bool code_read,
             FullLoadHelper *full_load)
 {
+    CPUState *cpu = env_cpu(env);
+    target_ulong addr = clean_addr(cpu, address);
     uintptr_t mmu_idx = get_mmuidx(oi);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
@@ -1829,16 +1841,14 @@ load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 
     /* Handle CPU specific unaligned behaviour */
     if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(env_cpu(env), addr, access_type,
-                             mmu_idx, retaddr);
+        cpu_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr);
     }
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
         if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
                             addr & TARGET_PAGE_MASK)) {
-            tlb_fill(env_cpu(env), addr, size,
-                     access_type, mmu_idx, retaddr);
+            tlb_fill(cpu, addr, size, access_type, mmu_idx, retaddr);
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);
         }
@@ -1861,7 +1871,7 @@ load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
         /* Handle watchpoints.  */
         if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
             /* On watchpoint hit, this will longjmp out.  */
-            cpu_check_watchpoint(env_cpu(env), addr, size,
+            cpu_check_watchpoint(cpu, addr, size,
                                  iotlbentry->attrs, BP_MEM_READ, retaddr);
         }
 
@@ -2341,9 +2351,11 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
 }
 
 static inline void QEMU_ALWAYS_INLINE
-store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
+store_helper(CPUArchState *env, target_ulong address, uint64_t val,
              TCGMemOpIdx oi, uintptr_t retaddr, MemOp op)
 {
+    CPUState *cpu = env_cpu(env);
+    target_ulong addr = clean_addr(cpu, address);
     uintptr_t mmu_idx = get_mmuidx(oi);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
@@ -2355,16 +2367,14 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 
     /* Handle CPU specific unaligned behaviour */
    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE,
-                             mmu_idx, retaddr);
+        cpu_unaligned_access(cpu, addr, MMU_DATA_STORE, mmu_idx, retaddr);
     }
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
         if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
                             addr & TARGET_PAGE_MASK)) {
-            tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
+            tlb_fill(cpu, addr, size, MMU_DATA_STORE, mmu_idx, retaddr);
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);
         }
@@ -2386,7 +2396,7 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
         /* Handle watchpoints.  */
         if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
             /* On watchpoint hit, this will longjmp out.  */
-            cpu_check_watchpoint(env_cpu(env), addr, size,
+            cpu_check_watchpoint(cpu, addr, size,
                                  iotlbentry->attrs, BP_MEM_WRITE, retaddr);
         }
 
@@ -2406,7 +2416,7 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 
         /* Handle clean RAM pages.  */
         if (tlb_addr & TLB_NOTDIRTY) {
-            notdirty_write(env_cpu(env), addr, size, iotlbentry, retaddr);
+            notdirty_write(cpu, addr, size, iotlbentry, retaddr);
         }
 
         haddr = (void *)((uintptr_t)addr + entry->addend);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 3d92c967fffa..64817bc10f1b 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -171,6 +171,7 @@ struct CPUClass {
     int reset_dump_flags;
     bool (*has_work)(CPUState *cpu);
     void (*do_interrupt)(CPUState *cpu);
+    vaddr (*do_clean_addr)(CPUState *cpu, vaddr addr);
     void (*do_unaligned_access)(CPUState *cpu, vaddr addr,
                                 MMUAccessType access_type,
                                 int mmu_idx, uintptr_t retaddr);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5a8c96072e41..f819f0673103 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -666,7 +666,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           /* CPUID_7_0_ECX_OSPKE is dynamic */ \
           CPUID_7_0_ECX_LA57)
 #define TCG_7_0_EDX_FEATURES 0
-#define TCG_7_1_EAX_FEATURES 0
+#define TCG_7_1_EAX_FEATURES CPUID_7_1_EAX_LAM
 #define TCG_APM_FEATURES 0
 #define TCG_6_EAX_FEATURES CPUID_6_EAX_ARAT
 #define TCG_XSAVE_FEATURES (CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XGETBV1)
@@ -997,7 +997,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
+            NULL, NULL, "lam", NULL,
             NULL, NULL, NULL, NULL,
         },
         .cpuid = {
@@ -7290,6 +7290,7 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
 #ifdef CONFIG_TCG
     cc->tcg_initialize = tcg_x86_init;
     cc->tlb_fill = x86_cpu_tlb_fill;
+    cc->do_clean_addr = x86_cpu_clean_addr;
 #endif
     cc->disas_set_info = x86_disas_set_info;
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 88e8586f8fb4..f8477e16685d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -229,6 +229,9 @@ typedef enum X86Seg {
 #define CR0_AM_MASK (1U << 18)
 #define CR0_PG_MASK (1U << 31)
 
+#define CR3_LAM_U48 (1ULL << 61)
+#define CR3_LAM_U57 (1ULL << 62)
+
 #define CR4_VME_MASK   (1U << 0)
 #define CR4_PVI_MASK   (1U << 1)
 #define CR4_TSD_MASK   (1U << 2)
@@ -250,6 +253,7 @@ typedef enum X86Seg {
 #define CR4_SMEP_MASK   (1U << 20)
 #define CR4_SMAP_MASK   (1U << 21)
 #define CR4_PKE_MASK    (1U << 22)
+#define CR4_LAM_SUP     (1U << 28)
 
 #define DR6_BD          (1 << 13)
 #define DR6_BS          (1 << 14)
@@ -796,6 +800,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 
 /* AVX512 BFloat16 Instruction */
 #define CPUID_7_1_EAX_AVX512_BF16 (1U << 5)
+/* Linear Address Masking */
+#define CPUID_7_1_EAX_LAM (1U << 26)
 
 /* CLZERO instruction */
 #define CPUID_8000_0008_EBX_CLZERO (1U << 0)
@@ -1924,6 +1930,7 @@ bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                       MMUAccessType access_type, int mmu_idx,
                       bool probe, uintptr_t retaddr);
 void x86_cpu_set_a20(X86CPU *cpu, int a20_state);
+vaddr x86_cpu_clean_addr(CPUState *cpu, vaddr addr);
 
 #ifndef CONFIG_USER_ONLY
 static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs)
diff --git a/target/i386/excp_helper.c b/target/i386/excp_helper.c
index 191471749fbf..edf8194574b2 100644
--- a/target/i386/excp_helper.c
+++ b/target/i386/excp_helper.c
@@ -406,7 +406,7 @@ static int handle_mmu_fault(CPUState *cs, vaddr addr, int size,
         }
 
         if (la57) {
-            pml5e_addr = ((env->cr[3] & ~0xfff) +
+            pml5e_addr = ((env->cr[3] & PG_ADDRESS_MASK) +
                     (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
             pml5e_addr = get_hphys(cs, pml5e_addr, MMU_DATA_STORE, NULL);
             pml5e = x86_ldq_phys(cs, pml5e_addr);
@@ -700,3 +700,29 @@ bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
     return true;
 #endif
 }
+
+static inline int64_t sign_extend64(uint64_t value, int index)
+{
+    int shift = 63 - index;
+    return (int64_t)(value << shift) >> shift;
+}
+
+vaddr x86_cpu_clean_addr(CPUState *cs, vaddr addr)
+{
+    CPUX86State *env = &X86_CPU(cs)->env;
+    bool la57 = env->cr[4] & CR4_LA57_MASK;
+
+    if (addr >> 63) {
+        if (env->cr[4] & CR4_LAM_SUP) {
+            return sign_extend64(addr, la57 ? 56 : 47);
+        }
+    } else {
+        if (env->cr[3] & CR3_LAM_U57) {
+            return sign_extend64(addr, 56);
+        } else if (env->cr[3] & CR3_LAM_U48) {
+            return sign_extend64(addr, 47);
+        }
+    }
+
+    return addr;
+}
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 034f46bcc210..6c099443ce13 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -753,7 +753,7 @@ hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
         }
 
         if (la57) {
-            pml5e_addr = ((env->cr[3] & ~0xfff) +
+            pml5e_addr = ((env->cr[3] & PG_ADDRESS_MASK) +
                     (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
             pml5e = x86_ldq_phys(cs, pml5e_addr);
             if (!(pml5e & PG_PRESENT_MASK)) {
-- 
2.26.2
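
A standalone sketch (not part of the patch) of the masking rule that the new
x86_cpu_clean_addr() hook implements: the metadata bits are dropped by
sign-extending the address from bit 56 (6 metadata bits in 62:57) or bit 47
(15 metadata bits in 62:48), so they are overwritten with copies of the bit
just below them. The pointer value and the 0x2a tag below are made-up example
numbers, used only to show the arithmetic.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Sign-extend 'value' from bit 'index', same trick as the patch's
 * sign_extend64() helper. */
static int64_t sign_extend64(uint64_t value, int index)
{
    int shift = 63 - index;
    return (int64_t)(value << shift) >> shift;
}

int main(void)
{
    /* A user pointer with a 6-bit tag (0x2a) placed in bits 62:57, the
     * layout CR3.LAM_U57 allows. */
    uint64_t plain  = 0x00007f1234567890ULL;
    uint64_t tagged = plain | (0x2aULL << 57);

    /* Stripping the tag is a sign extension from bit 56. */
    uint64_t clean = (uint64_t)sign_extend64(tagged, 56);

    printf("tagged: 0x%016" PRIx64 "\n", tagged);
    printf("clean:  0x%016" PRIx64 "\n", clean);  /* equals 'plain' again */
    return clean == plain ? 0 : 1;
}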