From: "Guilherme G. Piccoli"
Subject: Advice on oops - memory trap on non-memory access instruction (invalid CR2?)
To: kvm@vger.kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org,
 platform-driver-x86@vger.kernel.org, x86@kernel.org,
 iommu@lists.linux-foundation.org
Cc: gpiccoli@canonical.com, "Guilherme G. Piccoli",
 gavin.guo@canonical.com, halves@canonical.com,
 ioanna-maria.alifieraki@canonical.com, jay.vosburgh@canonical.com,
 mfo@canonical.com
Message-ID: <66eeae28-bfd3-c7a0-011c-801981b74243@canonical.com>
Date: Mon, 14 Oct 2019 00:32:38 -0300

Hello kernel community,

I'm investigating a recurring problem and am seeking advice; perhaps
someone reading this has seen a similar issue. I've included the mailing
lists I thought would be relevant; apologies if I missed any or included
some I shouldn't have.

We have a kernel memory oops due to an invalid read/write, but the trap
happens on an instruction that does not access memory. An example is in
[0] below. The oops reports a read access at address 0x458 while KVM was
apparently sending an IPI. The "Code" line, though (and RIP analysis with
objdump on the vmlinux image), shows the trapping instruction as:

  2b:*  84 c0    test %al,%al    <-- trapping instruction

This instruction clearly shouldn't trap on an invalid memory access.
Also, the 0x458 offset does not seem to appear anywhere in the code,
based on the assembly analysis in [1].

We have had 3 or 4 more reports like this: some show an invalid address
on write (again #PF), some show #GP. In all of them, the trapping
instruction is a non-memory-related opcode. We understand x86 has (or
should have) precise exceptions, so our current hypotheses are:

(a) An invalid CR2 - perhaps firmware code, executed due to a System
Management Interrupt, made an invalid memory access and polluted CR2.

(b) A processor error - there are some errata on Xeon processors, which
Intel claims were never observed in commercial systems.
(c) An error in the kernel's reporting when the oops happens - though we
investigated this in depth, and the exception handlers are quite concise
assembly routines that stack processor-generated data.

(d) A KVM/vAPIC-related failure, possibly caused by a bad access to the
guest's memory-mapped APIC area during interrupt virtualization.

(e) Intel processors do not actually provide precise exceptions.

All of these seem unlikely - maybe I'm missing something obvious, hence
this request for advice. A more detailed analysis of the registers in the
oops splat is in [2] below.

We are aware that this is an old kernel version; unfortunately, the user
reporting the issue is unable to update right now. Any
direction/suggestion/advice on how to obtain more data or prove/disprove
some of our hypotheses is highly appreciated. Questions are also welcome;
feel free to respond with any ideas you might have.

Thanks,


Guilherme

--

[0]
BUG: unable to handle kernel NULL pointer dereference at 0000000000000458
IP: [] kvm_irq_delivery_to_apic+0x56/0x220 [kvm]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: <...>
CPU: 40 PID: 78274 Comm: qemu-system-x86 Tainted: P W OE 4.4.0-45-generic #66~14.04.1-Ubuntu
Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.7 06/16/2016
task: ffff8800594dd280 ti: ffff880169168000 task.ti: ffff880169168000
RIP: 0010:[] [] kvm_irq_delivery_to_apic+0x56/0x220 [kvm]
RSP: 0018:ffff88016916bbe8 EFLAGS: 00010282
RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000003
RDX: 0000000000000040 RSI: 0000000000000010 RDI: ffff88016916bba8
RBP: ffff88016916bc30 R08: 0000000000000004 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000000008fd
R13: 0000000000000004 R14: ffff88004d3e8000 R15: ffff88016916bc40
FS: 00007fbd67fff700(0000) GS:ffff881ffeb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000458 CR3: 00000001961a9000 CR4: 00000000003426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 0000000000000001 0000000000000000 ffff882194b81400 0000000194b81410
 0000000000000300 00000000000008fd 0000000000000004 ffff882194b81400
 0000000000000001 ffff88016916bc78 ffffffffc0796d20 08000000000000fd
Call Trace:
 [] apic_reg_write+0x110/0x5f0 [kvm]
 [] kvm_apic_write_nodecode+0x4b/0x60 [kvm]
 [] handle_apic_write+0x1e/0x30 [kvm_intel]
 [] vmx_handle_exit+0x288/0xbf0 [kvm_intel]
 [] vcpu_enter_guest+0x8b4/0x10a0 [kvm]
 [] ? kvm_vcpu_block+0x191/0x2d0 [kvm]
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] kvm_arch_vcpu_ioctl_run+0xc4/0x3d0 [kvm]
 [] kvm_vcpu_ioctl+0x2ab/0x640 [kvm]
 [] do_vfs_ioctl+0x2dd/0x4c0
 [] ? __audit_syscall_entry+0xaf/0x100
 [] ? do_audit_syscall_entry+0x66/0x70
 [] SyS_ioctl+0x79/0x90
 [] entry_SYSCALL_64_fastpath+0x16/0x75
Code: d4 ff ff ff ff 75 0d 81 7a 10 ff 00 00 00 0f 84 7d 01 00 00 4c 8b
 45 c0 48 8b 75 c8 48 8d 4d d4 4c 89 fa 4c 89 f7 e8 ca be ff ff <84> c0
 0f 85 0c 01 00 00 41 8b 86 f0 09 00 00 85 c0 0f 8e fd 00
RIP [] kvm_irq_delivery_to_apic+0x56/0x220 [kvm]
 RSP
CR2: 0000000000000458

--

[1] Assembly analysis: https://pastebin.ubuntu.com/p/hdHNmvFtd8/

--

[2] More detailed analysis of registers:

%rax = 1 [return from kvm_irq_delivery_to_apic_fast()]

%rbx = 0x300 [ICR_LO register - this value comes from
kvm_apic_write_nodecode(), in which the offset / register is assigned to
%ebx]

%rdi = &bitmap

%rsi = 16 (0x10) from "for_each_set_bit(i, &bitmap, 16)" in function
kvm_irq_delivery_to_apic_fast().

%rcx = i in the above loop

%rdx = 64 (0x40 - BITS_PER_LONG, set inside find_next_bit() in the above
loop)

%r8 = 4 -> accumulates the return of kvm_apic_set_irq() - it means 4 IRQs
were delivered successfully. It could have been zeroed in the process,
and IRQs that were discarded don't accumulate here, so the value doesn't
say much.

%r14 = (struct kvm*) apic->vcpu->kvm

%r15 = (kvm_lapic_irq*) irq [stack-like addr, as it came from
apic_send_ipi(), in which irq is declared in stack - from the stack dump,
it is 0xffffffffc0796d20]

%r12 = apic->regs[ICR_LO] -> important register, describes the IPI data;
value of 0x8fd means:

  bits 0-7 (vector): 253
  bits 8-10 (delivery mode): 0 -> fixed
  bit 11 (destination mode): 1 -> logical
  bit 12 (delivery status): 0 -> idle
  bit 14 (level): 0 -> De-assert [oddity: Intel SDM vol 3 (10.6.1)
    claims this should be 1 in Xeon processors]
  bit 15 (trigger mode): 0 -> Edge
  bits 18-19 (shorthand): No

%r13 = irq.dest_id == apic->regs[ICR_HI] / some transformation of this
register
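
For anyone who wants to cross-check the %r12 decoding in [2], here is a minimal Python sketch that splits an ICR low dword into the fields listed there. The function name decode_icr_low and the lookup tables are illustrative only, not kernel code. The bit positions are the ones given in [2], and the delivery-mode and shorthand encodings are assumed to follow the Intel SDM description of the ICR.

```python
# Illustrative helper (not kernel code): decode the low 32 bits of the
# local APIC Interrupt Command Register, using the bit positions from [2].

# Encodings assumed from the Intel SDM description of the ICR.
DELIVERY_MODES = {
    0: "fixed",
    1: "lowest priority",
    2: "SMI",
    4: "NMI",
    5: "INIT",
    6: "start-up",
}
SHORTHANDS = {
    0: "none",
    1: "self",
    2: "all including self",
    3: "all excluding self",
}


def decode_icr_low(value):
    """Return the ICR_LO fields of `value` as a dict of readable entries."""
    return {
        "vector": value & 0xFF,                                          # bits 0-7
        "delivery_mode": DELIVERY_MODES.get((value >> 8) & 0x7, "reserved"),  # bits 8-10
        "dest_mode": "logical" if (value >> 11) & 1 else "physical",     # bit 11
        "delivery_status": "send pending" if (value >> 12) & 1 else "idle",   # bit 12
        "level": "assert" if (value >> 14) & 1 else "de-assert",         # bit 14
        "trigger_mode": "level" if (value >> 15) & 1 else "edge",        # bit 15
        "shorthand": SHORTHANDS[(value >> 18) & 0x3],                    # bits 18-19
    }


if __name__ == "__main__":
    # 0x8fd is the %r12 value from the oops in [0].
    for name, field in decode_icr_low(0x8FD).items():
        print(f"{name:16} {field}")
```

Running it on 0x8fd gives the same field values as the manual decoding in [2]; the same helper can be pointed at the ICR_LO values from the other reports to compare them quickly.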