Date: Mon, 13 Jan 2025 19:45:46 -0800
From: Andrew Morton
To: Chen Ridong
Cc: Vlastimil Babka, mhocko@kernel.org, hannes@cmpxchg.org,
 yosryahmed@google.com, roman.gushchin@linux.dev, shakeel.butt@linux.dev,
 muchun.song@linux.dev, davidf@vimeo.com, handai.szj@taobao.com,
 rientjes@google.com, kamezawa.hiroyu@jp.fujitsu.com, RCU,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 chenridong@huawei.com, wangweiyang2@huawei.com
Subject: Re: [PATCH v3] memcg: fix soft lockup in the OOM process
Message-Id: <20250113194546.3de1af46fa7a668111909b63@linux-foundation.org>
In-Reply-To: <58caaa4f-cf78-4d0f-af31-8a9277b6ebf5@huaweicloud.com>
References: <20241224025238.3768787-1-chenridong@huaweicloud.com>
 <1ea309c1-d0f8-4209-b0b0-e69ad4e986ae@suse.cz>
 <58caaa4f-cf78-4d0f-af31-8a9277b6ebf5@huaweicloud.com>

On Mon, 13 Jan 2025 14:51:55 +0800 Chen Ridong wrote:

> On 2025/1/6 16:45, Vlastimil Babka wrote:
> > On 12/24/24 03:52, Chen Ridong wrote:
> >> From: Chen Ridong
> >
> > +CC RCU
> >
> >> A soft lockup issue was found in a product with about 56,000 tasks
> >> in the OOM cgroup; it was traversing them when the soft lockup was
> >> triggered.
> >>
> > ...
> >
> >> @@ -430,10 +431,15 @@ static void dump_tasks(struct oom_control *oc)
> >>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
> >>  	else {
> >>  		struct task_struct *p;
> >> +		int i = 0;
> >>
> >>  		rcu_read_lock();
> >> -		for_each_process(p)
> >> +		for_each_process(p) {
> >> +			/* Avoid potential softlockup warning */
> >> +			if ((++i & 1023) == 0)
> >> +				touch_softlockup_watchdog();
> >
> > This might suppress the soft lockup, but won't an RCU stall still be detected?
>
> Yes, an RCU stall was still detected.
> For global OOM, the system is likely to struggle; do we have to do some
> work to suppress RCU stall detection? rcu_cpu_stall_reset()?
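
For reference, the (++i & 1023) == 0 test in the quoted hunk fires once every 1024 iterations. Below is a minimal userspace sketch of that idiom, not the kernel code. poke_watchdog() and walk_tasks() are hypothetical names invented for illustration: poke_watchdog() stands in for touch_softlockup_watchdog() and only counts how often it is called.

```c
#include <stddef.h>

/* Hypothetical stand-in for touch_softlockup_watchdog(); it only counts calls. */
static size_t pokes;

static void poke_watchdog(void)
{
	pokes++;
}

/*
 * Walk ntasks items and poke the watchdog once per 1024 iterations,
 * using the same mask test as the quoted hunk.
 * Returns how many times the watchdog was poked.
 */
static size_t walk_tasks(size_t ntasks)
{
	size_t n;
	int i = 0;

	pokes = 0;
	for (n = 0; n < ntasks; n++) {
		/* True when the low 10 bits of i are zero: i = 1024, 2048, ... */
		if ((++i & 1023) == 0)
			poke_watchdog();
	}
	return pokes;
}
```

With the roughly 56,000 tasks mentioned in the report, walk_tasks(56000) pokes the watchdog 54 times. As noted in the quoted exchange, this only addresses the soft lockup warning; the RCU stall was still reported, because the loop runs inside the RCU read-side critical section.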