From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC PATCH] watchdog: Adding softwatchdog
To: Tetsuo Handa, Guenter Roeck, Wim Van Sebroeck, Andrew Morton, Shakeel Butt
References: <20210424102555.28203-1-peter.enderborg@sony.com> <20210424102555.28203-2-peter.enderborg@sony.com> <844e3ecb-62c3-856a-7273-e22eee35e80f@i-love.sakura.ne.jp>
From: peter enderborg
Message-ID: <1d4ef30a-69c5-c4dc-c3bd-8d7c0c99b3f3@sony.com>
Date: Sun, 25 Apr 2021 08:42:05 +0200

On 4/25/21 3:08 AM, Tetsuo Handa wrote:
> On 2021/04/25 1:19, peter enderborg wrote:
>>> I don't think this proposal is a watchdog. I think this proposal is
>>> a timer based process killer, based on an assumption that any slowdown
>>> which prevents the monitor process from pinging for more than 0.5 seconds
>>> (if HZ == 1000) is caused by memory pressure.
>> You missing the point. The oom killer is a example of a work that it can do.
>> it is one policy. The idea is that you should have a policy that fits your needs.
> Implementing policy which can run in kernel from timer interrupt context is
> quite limited, for it is not allowed to perform operations that might sleep. See
>
>   [RFC] memory reserve for userspace oom-killer
>   https://urldefense.com/v3/__https://lkml.kernel.org/r/CALvZod7vtDxJZtNhn81V=oE-EPOf=4KZB2Bv6Giz*u3bFFyOLg@mail.gmail.com__;Kw!!JmoZiZGBv3RvKRSx!tqBFKAdfydRJ5M0oP4xCRvSscrBwChj5MWuj1YUNAk05uORWkbcz-iodFCHYjKdOytmHoO4$
>
> for implementing possibly useful policy.

If you need a more complex approach, you might need a work queue. A SIGTERM
solution, for example, might work like that: you send SIGTERM, wait some
time, and then send SIGKILL.

>> oom_score_adj is suitable for a android world. But it might be based on
>> uid's if your priority is some users over other. Or a memcg. Or as
>> Christophe Leroy want the current. The policy is only a example that
>> fits a one area.
> Horrible idea. Imagine a kernel module that randomly sends SIGTERM/SIGKILL
> to "current" thread. How normal systems can survive? A normal system is not
> designed to survive random signals.

I think you need to see it in the context of a watchdog. It might be
problematic, but it has a good statistical chance of hitting a CPU hog.
And seen as a watchdog, the alternative is a system reset. You take a
chance. Reboot should be the last resort. I can imagine a kernel module that
randomly sends SIGTERM/SIGKILL; we already have that. It is called oom-kill.
This is *exactly* the problem.

>
>> You need to describe your prioritization, in android it is
>> oom_score_adj. For example I would very much have a policy that sends
>> sigterm instead of sigkill.
> That's because Android framework is designed to survive random signals
> (in order to survive memory pressure situation).

Android uses signals a lot to control the system. It uses them differently
than you would with a shell or window manager.
>
>> But the integration with oom is there because
>> it is needed. Maybe a bad choice for political reasons but I don't it a
>> good idea to hide the intention. Please don't focus on the oom part.
> I wonder what system other than Android framework can utilize this module.

I think it will be useful for embedded systems as well.

> By the way, there already is "Software Watchdog" ( drivers/watchdog/softdog.c )
> which some people might call it "soft watchdog". It is very confusing to name
> your module as "softwatchdog". Please find a different name.
>

It is mentioned in the patch set. I had the idea of adding this function to
that one, but I decided it was better to keep it separate, to point out that
the feature is meant to be "soft" rather than hard.
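
For illustration only, and not part of the patch: the "send SIGTERM, wait
some time, then send SIGKILL" escalation discussed in this message could be
sketched in userspace roughly as below. The names terminate_then_kill and
spawn_stubborn_child, and the grace period, are hypothetical. An in-kernel
policy would presumably defer the wait to a work queue instead of blocking
as this sketch does.

```python
import signal
import subprocess
import sys

# Hypothetical test child: ignores SIGTERM so the escalation path is exercised.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def spawn_stubborn_child():
    """Start a child that ignores SIGTERM; return once its handler is set."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # blocks until the child has printed 'ready'
    proc.stdout.close()
    return proc


def terminate_then_kill(proc, grace_seconds=2.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the name of the last signal that had to be sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL cannot be caught or ignored
        proc.wait()   # reap the child so no zombie is left behind
        return "SIGKILL"


if __name__ == "__main__":
    child = spawn_stubborn_child()
    print("last signal sent:", terminate_then_kill(child, grace_seconds=0.5))
```

The grace period is the policy knob here: a longer wait gives a cooperative
process time to clean up, at the cost of a slower recovery when the process
never responds.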