From: Hui Wang <hui.wang@canonical.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@suse.com
Cc: surenb@google.com, colin.i.king@gmail.com, shy828301@gmail.com,
	hannes@cmpxchg.org, vbabka@suse.cz, hch@infradead.org, mgorman@suse.de,
	dan.carpenter@oracle.com, hui.wang@canonical.com
Subject: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS
Date: Wed, 26 Apr 2023 13:10:30 +0800
Message-Id: <20230426051030.112007-2-hui.wang@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230426051030.112007-1-hui.wang@canonical.com>
References: <20230426051030.112007-1-hui.wang@canonical.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

When stress-ng runs on a squashfs filesystem, the system ends up in a
hang-like state: stress-ng never finishes and the console stops
responding to user input. This happens on all the arm/arm64 platforms
we are working on. Through debugging, we found that the issue is caused
by the oom handling in the kernel.

fs->readahead() is called between memalloc_nofs_save() and
memalloc_nofs_restore(), and squashfs_readahead() calls alloc_page().
If no memory is left at that point, out_of_memory() is called without
__GFP_FS, so the oom killer is not triggered and the process loops
endlessly, waiting for some other context to trigger the oom killer and
release memory. But on a system whose entire root filesystem is
squashfs, nearly all userspace processes call out_of_memory() without
__GFP_FS, so nothing triggers the oom killer and the system appears to
hang when running stress-ng.

To fix it, schedule a work item that calls alloc_page(GFP_KERNEL),
which includes __GFP_FS, before out_of_memory() returns early because
__GFP_FS is missing from the gfp_mask.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
---
 mm/oom_kill.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 044e1eed720e..c9c38d6b8580 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1094,6 +1094,24 @@ int unregister_oom_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL_GPL(unregister_oom_notifier);
 
+/*
+ * If an oom occurs without the __GFP_FS flag in the gfp_mask, the oom killer
+ * will not be triggered. In this case, we could call schedule_work to run
+ * trigger_oom_killer_work() to trigger an oom forcibly with __GFP_FS flag,
+ * this could make the oom killer run with a fair chance.
+ */
+static void trigger_oom_killer_work(struct work_struct *work)
+{
+	struct page *tmp_page;
+
+	/* This could trigger an oom forcibly with a chance */
+	tmp_page = alloc_page(GFP_KERNEL);
+	if (tmp_page)
+		__free_page(tmp_page);
+}
+
+static DECLARE_WORK(oom_trigger_work, trigger_oom_killer_work);
+
 /**
  * out_of_memory - kill the "best" process when we run out of memory
  * @oc: pointer to struct oom_control
@@ -1135,8 +1153,10 @@ bool out_of_memory(struct oom_control *oc)
 	 * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
 	 * invoke the OOM killer even if it is a GFP_NOFS allocation.
 	 */
-	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
+	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc)) {
+		schedule_work(&oom_trigger_work);
 		return true;
+	}
 
 	/*
 	 * Check if there were limitations on the allocation (only relevant for
-- 
2.34.1
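
For readers following the gfp_mask check in the second hunk, here is a
minimal userspace sketch of that one decision, before and after the
patch. It is not kernel code: the MODEL_* flag values, struct
model_oom_control and model_out_of_memory() are hypothetical stand-ins
for the kernel's gfp flags, struct oom_control and out_of_memory(), and
the other early exits in out_of_memory() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real __GFP_* bits are defined by the kernel. */
#define MODEL_GFP_IO 0x1u
#define MODEL_GFP_FS 0x2u

/* Hypothetical stand-in for the two inputs the check looks at. */
struct model_oom_control {
	unsigned int gfp_mask;	/* 0 models a caller with no allocation context */
	bool memcg_oom;		/* models the result of is_memcg_oom(oc) */
};

enum model_action {
	MODEL_INVOKE_KILLER,	/* fall through to the rest of out_of_memory() */
	MODEL_SKIP,		/* return true without killing anything */
	MODEL_SKIP_AND_KICK	/* return true, but schedule the work item first */
};

/*
 * Models only the "gfp_mask set, __GFP_FS clear, not a memcg oom" branch.
 * With patched == false this is the existing behaviour; with patched == true
 * it is the behaviour the patch proposes.
 */
static enum model_action
model_out_of_memory(const struct model_oom_control *oc, bool patched)
{
	assert(oc != NULL);

	if (oc->gfp_mask && !(oc->gfp_mask & MODEL_GFP_FS) && !oc->memcg_oom)
		return patched ? MODEL_SKIP_AND_KICK : MODEL_SKIP;

	return MODEL_INVOKE_KILLER;
}
```

In this model, a squashfs readahead allocation corresponds to a
gfp_mask with MODEL_GFP_FS cleared: unpatched it yields MODEL_SKIP, and
patched it yields MODEL_SKIP_AND_KICK, which is where the real patch
calls schedule_work(&oom_trigger_work).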