Date: Tue, 14 Jan 2025 10:20:28 +0100
Subject: Re: [PATCH v3] memcg: fix soft lockup in the OOM process
To: Michal Hocko, Andrew Morton
Cc: Chen Ridong, hannes@cmpxchg.org, yosryahmed@google.com,
 roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev,
 davidf@vimeo.com, handai.szj@taobao.com, rientjes@google.com,
 kamezawa.hiroyu@jp.fujitsu.com, RCU, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 chenridong@huawei.com, wangweiyang2@huawei.com
References: <20241224025238.3768787-1-chenridong@huaweicloud.com>
 <1ea309c1-d0f8-4209-b0b0-e69ad4e986ae@suse.cz>
 <58caaa4f-cf78-4d0f-af31-8a9277b6ebf5@huaweicloud.com>
 <20250113194546.3de1af46fa7a668111909b63@linux-foundation.org>
From: Vlastimil Babka <vbabka@suse.cz>

On 1/14/25 09:40, Michal Hocko wrote:
> On Mon 13-01-25 19:45:46, Andrew Morton wrote:
>> On Mon, 13 Jan 2025 14:51:55 +0800 Chen Ridong wrote:
>>
>> > >> @@ -430,10 +431,15 @@ static void dump_tasks(struct oom_control *oc)
>> > >>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>> > >>  	else {
>> > >>  		struct task_struct *p;
>> > >> +		int i = 0;
>> > >>
>> > >>  		rcu_read_lock();
>> > >> -		for_each_process(p)
>> > >> +		for_each_process(p) {
>> > >> +			/* Avoid potential softlockup warning */
>> > >> +			if ((++i & 1023) == 0)
>> > >> +				touch_softlockup_watchdog();
>> > >
>> > > This might suppress the soft lockup, but won't a rcu stall still be detected?
>> >
>> > Yes, rcu stall was still detected.

"was" or "would be"? I thought only the memcg case was observed, or was that
some deliberate stress test of the global case? (Or the pr_info() console
stress test mentioned earlier, but created outside of the oom code?)

>> > For global OOM, system is likely to struggle, do we have to do some
>> > works to suppress RCU detete?
>>
>> rcu_cpu_stall_reset()?
>
> Do we really care about those? The code to iterate over all processes
> under RCU is there (basically) since ever and yet we do not seem to have
> many reports of stalls? Chen's situation is specific to memcg OOM and
> touching the global case was mostly for consistency reasons.

Then I'd rather not touch the global case if it's only theoretical? It's not
even exactly consistent, given that the memcg code uses a cond_resched()
(which can eventually be removed automatically once/if lazy preempt becomes
the sole implementation), while the touch_softlockup_watchdog() would
remain and do only half of the job?