From: Waiman Long
Message-ID: <0fa9dd8e-2d83-487e-bfb1-1f5d20cd9fe6@redhat.com>
Date: Wed, 19 Feb 2025 15:18:57 -0500
Subject: Re: [PATCH 1/2] hung_task: Show the blocker task if the task is hung on mutex
To: Steven Rostedt, "Masami Hiramatsu (Google)"
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton, Boqun Feng,
 Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
 Tomasz Figa, Sergey Senozhatsky, linux-kernel@vger.kernel.org,
 Linux Memory Management List
References: <173997003868.2137198.9462617208992136056.stgit@mhiramat.tok.corp.google.com>
 <173997004932.2137198.7959507113210521328.stgit@mhiramat.tok.corp.google.com>
 <20250219112308.5d905680@gandalf.local.home>
In-Reply-To: <20250219112308.5d905680@gandalf.local.home>

On 2/19/25 11:23 AM, Steven Rostedt wrote:
> On Wed, 19 Feb 2025 22:00:49 +0900
> "Masami Hiramatsu (Google)" wrote:
>
>> From: Masami Hiramatsu (Google)
>>
>> The "hung_task" shows a long-time uninterruptible slept task, but most
>> often, it's blocked on a mutex acquired by another task. Without
>> dumping such a task, investigating the root cause of the hung task
>> problem is very difficult.
>>
>> Fortunately CONFIG_DEBUG_MUTEXES=y allows us to identify the mutex
>> blocking the task. And the mutex has "owner" information, which can
>> be used to find the owner task and dump it with hung tasks.
>>
>> With this change, the hung task shows blocker task's info like below;
>>
> We've hit bugs like this in the field a few times, and it was very
> difficult to debug. Something like this would have made our lives much
> easier!

I agree that it will be a useful feature.

>> Signed-off-by: Masami Hiramatsu (Google)
>> ---
>>  kernel/hung_task.c           | 38 ++++++++++++++++++++++++++++++++++++++
>>  kernel/locking/mutex-debug.c |  1 +
>>  kernel/locking/mutex.c       |  9 +++++++++
>>  kernel/locking/mutex.h       |  6 ++++++
>>  4 files changed, 54 insertions(+)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 04efa7a6e69b..d1ce69504090 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -25,6 +25,8 @@
>>
>>  #include
>>
>> +#include "locking/mutex.h"
>> +
>>  /*
>>   * The number of tasks checked:
>>   */
>> @@ -93,6 +95,41 @@ static struct notifier_block panic_block = {
>>  	.notifier_call = hung_task_panic,
>>  };
>>
>> +
>> +#ifdef CONFIG_DEBUG_MUTEXES
>> +static void debug_show_blocker(struct task_struct *task)
>> +{
>> +	struct task_struct *g, *t;
>> +	unsigned long owner;
>> +	struct mutex *lock;
>> +
>> +	if (!task->blocked_on)
>> +		return;
>> +
>> +	lock = task->blocked_on->mutex;
> This is a catch 22. To look at the task's blocked_on, we need the
> lock->wait_lock held, otherwise this could be an issue. But to get that
> lock, we need to look at the task's blocked_on field! As this can race.
>
> Another thing is that the waiter is on the task's stack. Perhaps we need to
> move this into sched/core.c and be able to lock the task's rq.
> Because even something like:
>
>	waiter = READ_ONCE(task->blocked_on);
>
> May be garbage if the task were to suddenly wake up and run.
>
> Now if we were able to lock the task's rq, which would prevent it from
> being woken up, then the blocked_on field would not be at risk of being
> corrupted.

It is tricky to access the mutex_waiter structure, which is allocated on
the waiting task's stack. Another way to work around this issue is to add
a new blocked_on_mutex field to task_struct that points directly to the
relevant mutex. Yes, that increases the size of task_struct by 8 bytes on
64-bit systems, but it is a pretty large structure anyway. If this field
is accessed with READ_ONCE()/WRITE_ONCE(), we don't need to take a lock to
read it, though taking the wait_lock may still be needed to examine other
information inside the mutex.

Cheers,
Longman
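
As a rough illustration of the blocked_on_mutex idea above, here is a
minimal userspace sketch, not kernel code. The READ_ONCE()/WRITE_ONCE()
macros are simplified stand-ins for the kernel ones, the structures are
toy versions, and the helper names (set_blocked_on_mutex,
clear_blocked_on_mutex, report_blocked_on) and the mutex "name" member are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Toy types: only the members this sketch needs. */
struct mutex {
	const char *name;		/* hypothetical, for reporting only */
};

struct task_struct {
	const char *comm;
	struct mutex *blocked_on_mutex;	/* proposed field, NULL if not blocked */
};

/* Waiter side: publish the mutex before going to sleep on it. */
static void set_blocked_on_mutex(struct task_struct *task, struct mutex *lock)
{
	assert(task && lock);
	WRITE_ONCE(task->blocked_on_mutex, lock);
}

/* Waiter side: clear the field once the task is no longer waiting. */
static void clear_blocked_on_mutex(struct task_struct *task)
{
	assert(task);
	WRITE_ONCE(task->blocked_on_mutex, NULL);
}

/*
 * Observer side (e.g. a hung-task check): snapshot the pointer once,
 * without taking a lock, and use only the local copy afterwards.
 * Returns 1 if a blocking mutex was reported into buf, 0 otherwise.
 */
static int report_blocked_on(struct task_struct *task, char *buf, size_t len)
{
	struct mutex *lock = READ_ONCE(task->blocked_on_mutex);

	if (!lock)
		return 0;

	/*
	 * Dereferencing the snapshot is fine in this single-threaded toy.
	 * In the kernel, examining other state inside the mutex may still
	 * need wait_lock, as noted above.
	 */
	snprintf(buf, len, "%s is blocked on mutex %s", task->comm, lock->name);
	return 1;
}
```

The point of the sketch is that the observer never touches anything on the
waiter's stack: it reads one pointer-sized field from task_struct and works
from that snapshot.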