From: Lance Yang
Date: Thu, 24 Oct 2024 11:28:01 +0800
Subject: Re: [PATCH 0/2] hung_task: add detect count for hung tasks
To: Andrew Morton
Cc: cunhuang@tencent.com, leonylgao@tencent.com, j.granados@samsung.com,
 jsiddle@redhat.com, kent.overstreet@linux.dev, 21cnbao@gmail.com,
 ryan.roberts@arm.com, david@redhat.com, ziy@nvidia.com,
 libang.li@antgroup.com, baolin.wang@linux.alibaba.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20241022114736.83285-1-ioworker0@gmail.com>
 <20241023190515.a80c77fe3fa895910d554888@linux-foundation.org>
In-Reply-To: <20241023190515.a80c77fe3fa895910d554888@linux-foundation.org>

Hi Andrew,

Thanks a lot for paying attention!

On Thu, Oct 24, 2024 at 10:05 AM Andrew Morton wrote:
>
> On Tue, 22 Oct 2024 19:47:34 +0800 Lance Yang wrote:
>
> > Hi all,
> >
> > This patchset adds a counter, hung_task_detect_count, to track the number of
> > times hung tasks are detected.
> > This counter provides a straightforward way
> > to monitor hung task events without manually checking dmesg logs.
> >
> > With this counter in place, system issues can be spotted quickly, allowing
> > admins to step in promptly before system load spikes occur, even if the
> > hung_task_warnings value has been decreased to 0 well before.
> >
> > Recently, we encountered a situation where warnings about hung tasks were
> > buried in dmesg logs during load spikes. Introducing this counter could
> > have helped us detect such issues earlier and improve our analysis efficiency.
> >
>
> Isn't the answer to this problem "write a better parser"? I mean,

Yeah, I certainly agree that having a good parser is important, and I'm
working on that as well ;)

> we're providing userspace with information which is already available.

IMHO, there are two reasons why this counter remains valuable:

1) It lets us detect hung tasks before load spikes occur, using simple,
   common monitoring tools like Prometheus.
2) It keeps us aware of hung tasks even when hung_task_warnings has long
   since dropped to 0 and no further warnings reach dmesg.

Thanks again for your time!
Lance
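
P.S. To make the two points above concrete, here is a rough userspace
sketch of how a monitoring agent might consume the counter. It is only an
illustration under assumptions: the sysctl path and the metric name are
hypothetical (the thread names the counter, not where it is exposed), and
the delta helper is just one way to notice new detections without reading
dmesg.

```python
from pathlib import Path

# Assumption: the counter is exposed as a sysctl file at this path. The
# thread only names the counter, so treat the location as hypothetical.
DEFAULT_PATH = "/proc/sys/kernel/hung_task_detect_count"

# Hypothetical metric name, chosen for illustration only.
METRIC = "hung_task_detect_count"


def read_count(path: str = DEFAULT_PATH) -> int:
    """Return the cumulative number of hung-task detections stored in `path`."""
    return int(Path(path).read_text().strip())


def to_prometheus(count: int) -> str:
    """Render the count in the Prometheus text exposition format."""
    return (
        f"# HELP {METRIC} Cumulative hung task detections.\n"
        f"# TYPE {METRIC} counter\n"
        f"{METRIC} {count}\n"
    )


def newly_detected(previous: int, current: int) -> int:
    """Detections since the last poll.

    A value lower than the previous one is treated as a counter reset
    (for example after a reboot), so the current value is returned as is.
    """
    return current - previous if current >= previous else current
```

Because the agent compares successive readings, it keeps noticing new
detections even once hung_task_warnings has reached 0 and nothing more is
printed to dmesg.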