Date: Mon, 11 Oct 2021 11:23:30 +0200
From: Andrea Righi
To: Dmitry Vyukov
Cc: Marco Elver, Alexander Potapenko, kasan-dev@googlegroups.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: BUG: soft lockup in __kmalloc_node() with KFENCE enabled

On Mon, Oct 11, 2021 at 09:19:48AM +0200, Dmitry Vyukov wrote:
> On Mon, 11 Oct 2021 at 09:10, Andrea Righi wrote:
> >
> > On Mon, Oct 11, 2021 at 08:48:29AM +0200, Marco Elver wrote:
> > > On Mon, 11 Oct 2021 at 08:32, Andrea Righi wrote:
> > > > On Mon, Oct 11, 2021 at 08:00:00AM +0200, Marco Elver wrote:
> > > > > On Sun, 10 Oct 2021 at 15:53, Andrea Righi wrote:
> > > > > > I can systematically reproduce the following soft lockup w/ the latest
> > > > > > 5.15-rc4 kernel (and all the 5.14, 5.13 and 5.12 kernels that I've
> > > > > > tested so far).
> > > > > >
> > > > > > I've found this issue by running systemd autopkgtest (I'm using the
> > > > > > latest systemd in Ubuntu - 248.3-1ubuntu7 - but it should happen with
> > > > > > any recent version of systemd).
> > > > > >
> > > > > > I'm running this test inside a local KVM instance and apparently systemd
> > > > > > is starting up its own KVM instances to run its tests, so the context is
> > > > > > a nested KVM scenario (even if I don't think the nested KVM part really
> > > > > > matters).
> > > > > >
> > > > > > Here's the oops:
> > > > > >
> > > > > > [ 36.466565] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [udevadm:333]
> > > > > > [ 36.466565] Modules linked in: btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear psmouse floppy
> > > > > > [ 36.466565] CPU: 0 PID: 333 Comm: udevadm Not tainted 5.15-rc4
> > > > > > [ 36.466565] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > > > > [...]
> > > > > >
> > > > > > If I disable CONFIG_KFENCE the soft lockup doesn't happen and systemd
> > > > > > autotest completes just fine.
> > > > > >
> > > > > > We've decided to disable KFENCE in the latest Ubuntu Impish kernel
> > > > > > (5.13) for now, because of this issue, but I'm still investigating
> > > > > > trying to better understand the problem.
> > > > > >
> > > > > > Any hint / suggestion?
> > > > >
> > > > > Can you confirm this is not a QEMU TCG instance? There's been a known
> > > > > issue with it: https://bugs.launchpad.net/qemu/+bug/1920934
> > > >
> > > > It looks like systemd is running qemu-system-x86 without any "accel"
> > > > options, so IIUC the instance shouldn't use TCG. Is this a correct
> > > > assumption or is there a better way to check?
> > >
> > > AFAIK, the default is TCG if nothing else is requested. What was the
> > > command line?
> >
> > This is the full command line of what systemd is running:
> >
> >   /bin/qemu-system-x86_64 -smp 4 -net none -m 512M -nographic -vga none -kernel /boot/vmlinuz-5.15-rc4 -drive format=raw,cache=unsafe,file=/var/tmp/systemd-test.sI1nrh/badid.img -initrd /boot/initrd.img-5.15-rc4 -append root=/dev/sda1 rw raid=noautodetect rd.luks=0 loglevel=2 init=/lib/systemd/systemd console=ttyS0 selinux=0 SYSTEMD_UNIT_PATH=/usr/lib/systemd/tests/testdata/testsuite-14.units:/usr/lib/systemd/tests/testdata/units: systemd.unit=testsuite.target systemd.wants=testsuite-14.service systemd.wants=end.service
> >
> > And this is running inside a KVM instance (so a nested KVM scenario).
>
> Hi Andrea,
>
> I think you need to pass -enable-kvm to make it "nested KVM scenario",
> otherwise it's TCG emulation.

So, IIUC I shouldn't hit the QEMU TCG issue mentioned by Marco, right?

>
> You seem to use the default 20s stall timeout. FWIW syzbot uses 160
> secs timeout for TCG emulation to avoid false positive warnings:
> https://github.com/google/syzkaller/blob/838e7e2cd9228583ca33c49a39aea4d863d3e36d/dashboard/config/linux/upstream-arm64-kasan.config#L509
> There are a number of other timeouts raised as well, some as high as
> 420 seconds.

I see, I'll try with these settings and see if I can still hit the soft
lockup messages.

Thanks,
-Andrea
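
For reference, a rough way to check whether a given QEMU command line asks for KVM at all. This is only a sketch: the helper name requests_kvm is made up for illustration, and it only looks for the -enable-kvm, -accel kvm and -machine/-M accel=kvm spellings. It assumes that a command line with none of them falls back to TCG, as discussed above, and it doesn't handle wrapper binaries or config files that may select an accelerator on their own.

```python
import shlex


def requests_kvm(cmdline: str) -> bool:
    """Heuristically report whether a QEMU command line requests KVM.

    Hypothetical helper, not part of QEMU or systemd. It only recognises
    -enable-kvm, -accel kvm[,...] and -machine/-M ...,accel=kvm[:...].
    """
    # Accept both single- and double-dash option spellings.
    args = ["-" + a.lstrip("-") if a.startswith("--") else a
            for a in shlex.split(cmdline)]

    for i, arg in enumerate(args):
        if arg == "-enable-kvm":
            return True
        if i + 1 >= len(args):
            continue
        fields = args[i + 1].split(",")
        if arg == "-accel" and fields[0] == "kvm":
            return True
        if arg in ("-machine", "-M"):
            for field in fields:
                # accel= may carry a colon-separated fallback list.
                if field.startswith("accel=") and \
                        "kvm" in field[len("accel="):].split(":"):
                    return True
    return False


if __name__ == "__main__":
    # Shortened form of the command line quoted above.
    cmd = "/bin/qemu-system-x86_64 -smp 4 -net none -m 512M -nographic -vga none"
    print("KVM requested" if requests_kvm(cmd) else "no KVM option found")
```

Run against the command line systemd uses here, it finds no KVM option, which matches Dmitry's reading that the guest is running under TCG.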