Date: Mon, 13 Feb 2023 14:24:44 +0100
From: Michal Hocko
To: huyd12@chinatelecom.cn
Cc: liuq131@chinatelecom.cn, akpm@linux-foundation.org, agruenba@redhat.com, 'Christian Brauner', linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: 回复: [PATCH] pid: add handling of too many zombie processes
References: <20230208094905.373-1-liuq131@chinatelecom.cn> <000e01d93c56$3a4bcb00$aee36100$@chinatelecom.cn>
In-Reply-To: <000e01d93c56$3a4bcb00$aee36100$@chinatelecom.cn>

On Thu 09-02-23 15:14:57, huyd12@chinatelecom.cn wrote:
>
> Any comments will be appreciated.
>
>
>
> -----Original Message-----
> From: liuq131@chinatelecom.cn
> Sent: February 8, 2023 17:49
> To: akpm@linux-foundation.org
> Cc: agruenba@redhat.com; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> huyd12@chinatelecom.cn; liuq
> Subject: [PATCH] pid: add handling of too many zombie processes
>
> There is a common situation that a parent process forks many child processes
> to execute tasks, but the parent process does not execute wait/waitpid when
> the child process exits, resulting in a large number of child processes
> becoming zombie processes.
>
> At this time, if the number of processes in the system out of
> kernel.pid_max, the new fork syscall will fail, and the system will not be
> able to execute any command at this time (unless an old process exits)
>
> eg:
> [root@lq-workstation ~]# ls
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: Resource temporarily unavailable
> [root@lq-workstation ~]# reboot
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: Resource temporarily unavailable
>
> I dealt with this situation in the alloc_pid function, and found a process
> with the most zombie subprocesses, and more than 10(or other reasonable
> values?) zombie subprocesses, so I tried to kill this process to release the
> pid resources.

Abusing oom_kill_process is not the right approach. Also, any hard-coded
limit for the number of zombies can turn out to be really tricky and can
cause regressions. Is there any reason you cannot contain those misbehaving
workloads in a pid controller?

-- 
Michal Hocko
SUSE Labs
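
For illustration only, here is a minimal userspace sketch of the two remedies
touched on in this thread: the parent reaping its children with waitpid() so
they do not linger as zombies, and capping a workload through the cgroup v2
pids controller by writing to its pids.max file. The helper names
(spawn_children, reap_children, set_pids_max) are hypothetical and the cgroup
directory is whatever the caller passes in; none of this comes from the patch
under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children that exit at once.
 * Returns how many were actually started. */
int spawn_children(int n)
{
	int started = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* e.g. EAGAIN when a pid limit is hit */
		if (pid == 0)
			_exit(0);	/* child: stays a zombie until reaped */
		started++;
	}
	return started;
}

/* Hypothetical helper: reap every child of the caller.
 * Returns the number of children reaped. */
int reap_children(void)
{
	int reaped = 0;

	for (;;) {
		pid_t pid = waitpid(-1, NULL, 0);

		if (pid > 0) {
			reaped++;
			continue;
		}
		if (pid < 0 && errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		break;			/* ECHILD: nothing left to reap */
	}
	return reaped;
}

/* Hypothetical helper: write a task limit to <cgroup_dir>/pids.max.
 * Returns 0 on success, -1 on failure. */
int set_pids_max(const char *cgroup_dir, long limit)
{
	char path[4096];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/pids.max", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%ld\n", limit) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

On a real system set_pids_max() would be pointed at a cgroup v2 directory that
has the pids controller enabled, and the misbehaving workload would be moved
into that cgroup through its cgroup.procs file. Once the limit is reached,
fork() inside the cgroup fails with EAGAIN while the rest of the system can
still allocate pids.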