Date: Thu, 9 Oct 2025 17:51:07 -0700
From: Andrew Morton
To: Jinchao Wang
Cc: Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Alexander Potapenko,
 Randy Dunlap, Marco Elver, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", Juri Lelli,
 Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, "Liang, Kan",
 David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka,
 Suren Baghdasaryan, Michal Hocko, Nathan Chancellor, Nick Desaulniers,
 Bill Wendling, Justin Stitt, Kees Cook, Alice Ryhl, Sami Tolvanen,
 Miguel Ojeda, Masahiro Yamada, Rong Xu, Naveen N Rao, David Kaplan,
 Andrii Nakryiko, Jinjie Ruan, Nam Cao, workflows@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, linux-mm@kvack.org, llvm@lists.linux.dev,
 Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
 kasan-dev@googlegroups.com, "David S. Miller", Mathieu Desnoyers,
 linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v7 00/23] mm/ksw: Introduce real-time KStackWatch debugging tool
Message-Id: <20251009175107.ee07228e3253afca5b487316@linux-foundation.org>
In-Reply-To: <20251009105650.168917-1-wangjinchao600@gmail.com>
References: <20251009105650.168917-1-wangjinchao600@gmail.com>
X-Mailing-List: workflows@vger.kernel.org

On Thu, 9 Oct 2025 18:55:36 +0800 Jinchao Wang wrote:

> This patch series introduces KStackWatch, a lightweight debugging tool to detect
> kernel stack corruption in real time.
> It installs a hardware breakpoint
> (watchpoint) at a function's specified offset using `kprobe.post_handler` and
> removes it in `fprobe.exit_handler`. This covers the full execution window and
> reports corruption immediately with time, location, and a call stack.
>
> The motivation comes from scenarios where corruption occurs silently in one
> function but manifests later in another, without a direct call trace linking
> the two. Such bugs are often extremely hard to debug with existing tools.
> These scenarios are demonstrated in test 3–5 (silent corruption test, patch 20).
>
> ...
>
>  20 files changed, 1809 insertions(+), 62 deletions(-)

It's obviously a substantial project.  We need to decide whether to add
this to Linux.

There are some really important [0/N] changelog details which I'm not
immediately seeing:

Am I correct in thinking that it's x86-only?  If so, what's involved in
enabling other architectures?  Is there any such work in progress?

What motivated the work?  Was there some particular class of failures
which you were persistently seeing and wished to fix more efficiently?

Has this code (or something like it) been used in production systems?
If so, by whom and with what results?

Has it actually found some kernel bugs yet?  If so, details please.

Can this be enabled on production systems?  If so, what is the measured
runtime overhead?