Date: Wed, 26 Apr 2023 10:33:05 +0200
From: Michal Hocko
To: Hui Wang
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, surenb@google.com,
	colin.i.king@gmail.com, shy828301@gmail.com, hannes@cmpxchg.org,
	vbabka@suse.cz, hch@infradead.org, mgorman@suse.de,
	dan.carpenter@oracle.com, Phillip Lougher
Subject: Re: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS
References: <20230426051030.112007-1-hui.wang@canonical.com>
	<20230426051030.112007-2-hui.wang@canonical.com>
In-Reply-To: <20230426051030.112007-2-hui.wang@canonical.com>

[CC squashfs maintainer]

On Wed 26-04-23 13:10:30, Hui Wang wrote:
> If we run the stress-ng in the filesystem of squashfs, the system
> will be in a state something like hang, the stress-ng couldn't
> finish running and the console couldn't react to users' input.
>
> This issue happens on all arm/arm64 platforms we are working on,
> through debugging, we found this issue is introduced by oom handling
> in the kernel.
>
> The fs->readahead() is called between memalloc_nofs_save() and
> memalloc_nofs_restore(), and the squashfs_readahead() calls
> alloc_page(), in this case, if there is no memory left, the
> out_of_memory() will be called without __GFP_FS, then the oom killer
> will not be triggered and this process will loop endlessly and wait
> for others to trigger oom killer to release some memory. But for a
> system with the whole root filesystem constructed by squashfs,
> nearly all userspace processes will call out_of_memory() without
> __GFP_FS, so we will see that the system enters a state something like
> hang when running stress-ng.
>
> To fix it, we could trigger a kthread to call page_alloc() with
> __GFP_FS before returning from out_of_memory() due to without
> __GFP_FS.

I do not think this is an appropriate way to deal with this issue.
Does it even make sense to trigger OOM killer for something like
readahead? Would it be more sensible to fail the allocation instead?
That being said, should allocations from squashfs_readahead use
__GFP_RETRY_MAYFAIL instead?

> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Suren Baghdasaryan
> Cc: Colin Ian King
> Cc: Yang Shi
> Cc: Johannes Weiner
> Cc: Vlastimil Babka
> Cc: Christoph Hellwig
> Cc: Mel Gorman
> Cc: Dan Carpenter
> Signed-off-by: Hui Wang
> ---
>  mm/oom_kill.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 044e1eed720e..c9c38d6b8580 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -1094,6 +1094,24 @@ int unregister_oom_notifier(struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL_GPL(unregister_oom_notifier);
>  
> +/*
> + * If an oom occurs without the __GFP_FS flag in the gfp_mask, the oom killer
> + * will not be triggered. In this case, we could call schedule_work to run
> + * trigger_oom_killer_work() to trigger an oom forcibly with __GFP_FS flag,
> + * this could make the oom killer run with a fair chance.
> + */
> +static void trigger_oom_killer_work(struct work_struct *work)
> +{
> +	struct page *tmp_page;
> +
> +	/* This could trigger an oom forcibly with a chance */
> +	tmp_page = alloc_page(GFP_KERNEL);
> +	if (tmp_page)
> +		__free_page(tmp_page);
> +}
> +
> +static DECLARE_WORK(oom_trigger_work, trigger_oom_killer_work);
> +
>  /**
>   * out_of_memory - kill the "best" process when we run out of memory
>   * @oc: pointer to struct oom_control
> @@ -1135,8 +1153,10 @@ bool out_of_memory(struct oom_control *oc)
>  	 * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
>  	 * invoke the OOM killer even if it is a GFP_NOFS allocation.
>  	 */
> -	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
> +	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc)) {
> +		schedule_work(&oom_trigger_work);
>  		return true;
> +	}
>  
>  	/*
>  	 * Check if there were limitations on the allocation (only relevant for
> -- 
> 2.34.1

-- 
Michal Hocko
SUSE Labs
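
For readers following the thread, the gate being discussed can be modelled
in userspace. This is only a sketch under stated assumptions, not kernel
code: the SKETCH_* bit values and the sketch_* function names are
hypothetical placeholders, and the NOFS scope is modelled simply as
clearing the FS bit, which matches the description in the quoted commit
message. The second function copies the condition from the quoted hunk.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; not the kernel's definitions. */
#define SKETCH_GFP_IO     0x1u
#define SKETCH_GFP_FS     0x2u
#define SKETCH_GFP_KERNEL (SKETCH_GFP_IO | SKETCH_GFP_FS)

/*
 * Assumption: models an allocation made between memalloc_nofs_save() and
 * memalloc_nofs_restore() by masking off the FS bit of the requested mask.
 */
unsigned int sketch_effective_gfp(unsigned int gfp_mask, bool in_nofs_scope)
{
	return in_nofs_scope ? (gfp_mask & ~SKETCH_GFP_FS) : gfp_mask;
}

/*
 * Mirrors the condition in the quoted out_of_memory() hunk: a non-zero
 * mask without the FS bit, outside a memcg OOM, returns early and no
 * victim is selected.
 */
bool sketch_oom_skips_kill(unsigned int gfp_mask, bool is_memcg_oom)
{
	return gfp_mask && !(gfp_mask & SKETCH_GFP_FS) && !is_memcg_oom;
}
```

In this model a SKETCH_GFP_KERNEL request issued inside the NOFS scope
loses the FS bit and therefore takes the early return, which is the
situation the commit message describes for squashfs_readahead().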