From: Ivan Babrou
Date: Wed, 3 Feb 2021 16:52:42 -0800
Subject: Re: BUG: KASAN: stack-out-of-bounds in unwind_next_frame+0x1df5/0x2650
To: Josh Poimboeuf
Cc: Peter Zijlstra, kernel-team, Ignat Korchagin, Hailong liu,
    Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrew Morton,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin", Miroslav Benes, Julien Thierry, Jiri Slaby,
    kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel,
    Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
    "Steven Rostedt (VMware)", Alexei Starovoitov, Daniel Borkmann,
    Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
    John Fastabend, KP Singh, Robert Richter,
    "Joel Fernandes (Google)", Mathieu Desnoyers,
    Linux Kernel Network Developers, bpf@vger.kernel.org,
    Alexey Kardashevskiy
References: <20210203190518.nlwghesq75enas6n@treble>
    <20210203232735.nw73kugja56jp4ls@treble>
    <20210204001700.ry6dpqvavcswyvy7@treble>
In-Reply-To: <20210204001700.ry6dpqvavcswyvy7@treble>

On Wed, Feb 3, 2021 at 4:17 PM Josh Poimboeuf wrote:
>
> On Wed, Feb 03, 2021 at 03:30:35PM -0800, Ivan Babrou wrote:
> > > > > Can you recreate with this patch, and add "unwind_debug" to the cmdline?
> > > > > It will spit out a bunch of stack data.
> > > >
> > > > Here's the tree I'm building:
> > > >
> > > > * https://github.com/bobrik/linux/tree/ivan/static-call-5.9
> > > >
> > > > It contains:
> > > >
> > > > * v5.9 tag as the base
> > > > * static_call-2020-10-12 tag
> > > > * dm-crypt patches to reproduce the issue with KASAN
> > > > * x86/unwind: Add 'unwind_debug' cmdline option
> > > > * tracepoint: Fix race between tracing and removing tracepoint
> > > >
> > > > The very same issue can be reproduced on 5.10.11 with no patches,
> > > > but I'm going with 5.9, since it boils down to static call changes.
> > > >
> > > > Here's the decoded stack from the kernel with unwind debug enabled:
> > > >
> > > > * https://gist.github.com/bobrik/ed052ac0ae44c880f3170299ad4af56b
> > > >
> > > > See my first email for the exact commands that trigger this.
> > >
> > > Thanks. Do you happen to have the original dmesg, before running it
> > > through the post-processing script?
> >
> > Yes, here it is:
> >
> > * https://gist.github.com/bobrik/8c13e6a02555fb21cadabb74cdd6f9ab
>
> It appears the unwinder is getting lost in crypto code. No idea what
> this has to do with static calls though. Or maybe you're seeing
> multiple issues.
>
> Does this fix it?

It does for the dm-crypt case! But so does the following commit in
5.11 (and 5.10.12):

* https://github.com/torvalds/linux/commit/ce8f86ee94?w=1

The reason I stuck to the dm-crypt reproduction is that it reproduces
reliably.

We also have the following stack that doesn't touch any crypto:

* https://gist.github.com/bobrik/40e2559add2f0b26ae39da30dc451f1e

I cannot reproduce this one, and it took 2 days of uptime for it to
happen. Is there anything I can do to help diagnose it?

My goal is to enable multishot KASAN in our pre-production
environment, but currently it sometimes starves TX queues on the NIC
because multiple unwind_next_frame reports are emitted in a row from
an interrupt, which disables the network interface. That is not
something we can tolerate.