Date: Tue, 14 Jan 2025 09:40:34 +0100
From: Michal Hocko
To: Andrew Morton
Cc: Chen Ridong, Vlastimil Babka, hannes@cmpxchg.org, yosryahmed@google.com,
 roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev,
 davidf@vimeo.com, handai.szj@taobao.com, rientjes@google.com,
 kamezawa.hiroyu@jp.fujitsu.com, RCU, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 chenridong@huawei.com, wangweiyang2@huawei.com
Subject: Re: [PATCH v3] memcg: fix soft lockup in the OOM process
References: <20241224025238.3768787-1-chenridong@huaweicloud.com>
 <1ea309c1-d0f8-4209-b0b0-e69ad4e986ae@suse.cz>
 <58caaa4f-cf78-4d0f-af31-8a9277b6ebf5@huaweicloud.com>
 <20250113194546.3de1af46fa7a668111909b63@linux-foundation.org>
In-Reply-To: <20250113194546.3de1af46fa7a668111909b63@linux-foundation.org>

On Mon 13-01-25 19:45:46, Andrew Morton wrote:
> On Mon, 13 Jan 2025 14:51:55 +0800 Chen Ridong wrote:
>
> >
> >
> > On 2025/1/6 16:45, Vlastimil Babka wrote:
> > > On 12/24/24 03:52, Chen Ridong wrote:
> > >> From: Chen Ridong
> > >
> > > +CC RCU
> > >
> > >> A soft lockup issue was found in the product with about 56,000 tasks were
> > >> in the OOM cgroup, it was traversing them when the soft lockup was
> > >> triggered.
> > >>
> >
> > ...
> >
> > >> @@ -430,10 +431,15 @@ static void dump_tasks(struct oom_control *oc)
> > >>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
> > >>  	else {
> > >>  		struct task_struct *p;
> > >> +		int i = 0;
> > >>
> > >>  		rcu_read_lock();
> > >> -		for_each_process(p)
> > >> +		for_each_process(p) {
> > >> +			/* Avoid potential softlockup warning */
> > >> +			if ((++i & 1023) == 0)
> > >> +				touch_softlockup_watchdog();
> > >
> > > This might suppress the soft lockup, but won't a rcu stall still be detected?
> >
> > Yes, rcu stall was still detected.
> > For global OOM, system is likely to struggle, do we have to do some
> > works to suppress RCU detete?
>
> rcu_cpu_stall_reset()?

Do we really care about those? The code to iterate over all processes
under RCU has been there (basically) forever, and yet we do not seem to
have many reports of stalls? Chen's situation is specific to memcg OOM,
and touching the global case was mostly for consistency reasons.
-- 
Michal Hocko
SUSE Labs