Date: Wed, 3 May 2023 14:20:31 +0200
From: Michal Hocko
To: Hui Wang
Cc: Gao Xiang, linux-mm@kvack.org, akpm@linux-foundation.org,
 surenb@google.com, colin.i.king@gmail.com, shy828301@gmail.com,
 hannes@cmpxchg.org, vbabka@suse.cz, hch@infradead.org, mgorman@suse.de,
 Phillip Lougher
Subject: Re: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs
 without __GFP_FS
References: <20230426051030.112007-1-hui.wang@canonical.com>
 <20230426051030.112007-2-hui.wang@canonical.com>
 <68b085fe-3347-507c-d739-0dc9b27ebe05@linux.alibaba.com>
 <4aa48b6a-362d-de1b-f0ff-9bb8dafbdcc7@canonical.com>

On Wed 03-05-23 19:49:19, Hui Wang wrote:
>
> On 4/29/23 03:53, Michal Hocko wrote:
> > On Thu 27-04-23 11:47:10, Hui Wang wrote:
> > [...]
> > > So Michal,
> > >
> > > Don't know if you read the "[PATCH 0/1] mm/oom_kill: system enters a state
> > > something like hang when running stress-ng", do you know why out_of_memory()
> > > will return immediately if there is no __GFP_FS, could we drop these lines
> > > directly:
> > >
> > >     /*
> > >      * The OOM killer does not compensate for IO-less reclaim.
> > >      * pagefault_out_of_memory lost its gfp context so we have to
> > >      * make sure exclude 0 mask - all other users should have at least
> > >      * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
> > >      * invoke the OOM killer even if it is a GFP_NOFS allocation.
> > >      */
> > >     if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
> > >         return true;
> > The comment is rather hard to grasp without an intimate knowledge of the
> > memory reclaim. The primary reason is that the allocation context
> > without __GFP_FS (and also __GFP_IO) cannot perform a full memory
> > reclaim because fs or the storage subsystem might be holding locks
> > required for the memory reclaim. This means that a large amount of
> > reclaimable memory is out of sight of the specific direct reclaim
> > context. If we allowed oom killer to trigger we could invoke the oom
> > killer while there is a lot of otherwise reclaimable memory. As you can
> > imagine not something many users would appreciate as the oom kill is a
> > very disruptive operation. In this case we rely on kswapd or other
> > GFP_KERNEL like allocation context to make forward instead. If there is
> > really nothing reclaimable then the oom killer would eventually hit from
> > elsewhere.
> >
> > HTH
> Hi Michal,
>
> Understand. Thanks for explanation. So we can't remove those 2 lines of
> code.
>
> Here in my patch, letting a kthread allocate a page with GFP_KERNEL, It
> could possibly trigger the reclaim and if nothing reclaimable, trigger the
> oom killer. Do you think it is a safe workaround for the issue we are facing
> currently?

I have to say I really dislike this workaround. Allocating memory just
to release it and potentially hit the oom killer is really not a very
mindful approach to the problem. It is not a reliable way either because
you depend on the WQ context which might be clogged for the very same
lack of memory.

This issue simply doesn't have a simple and neat solution unfortunately.
I would prefer if the fs could be less demanding from NOFS context if
that is possible at all.
-- 
Michal Hocko
SUSE Labs
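
For readers following the thread, the early-return check quoted above can be modelled in userspace C. This is only a minimal sketch under stated assumptions, not kernel code: the DEMO_* flag values, struct demo_oom_control, and demo_oom_skips_kill() are hypothetical names invented for illustration, the flag bits do not match the kernel's real values, and the kswapd reclaim bit is left out for brevity. It shows only which allocation contexts make the condition true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits only (assumption): the kernel's real gfp
 * values differ and have changed between versions. */
typedef unsigned int gfp_t;

#define DEMO_GFP_IO             0x1u
#define DEMO_GFP_FS             0x2u
#define DEMO_GFP_DIRECT_RECLAIM 0x4u

/* Simplified stand-ins for GFP_NOFS and GFP_KERNEL. */
#define DEMO_GFP_NOFS   (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_IO)
#define DEMO_GFP_KERNEL (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)

/* Hypothetical, trimmed-down model of the OOM context. */
struct demo_oom_control {
	gfp_t gfp_mask;  /* 0 models the page-fault path that lost its gfp context */
	bool memcg_oom;  /* stands in for is_memcg_oom(oc) */
};

/*
 * Mirrors the quoted condition: return true when the OOM path would
 * return early without selecting a victim. That happens for a non-zero
 * mask that lacks the FS bit, unless the OOM is a memcg OOM.
 */
bool demo_oom_skips_kill(const struct demo_oom_control *oc)
{
	assert(oc != NULL);
	return oc->gfp_mask && !(oc->gfp_mask & DEMO_GFP_FS) && !oc->memcg_oom;
}
```

In this model a DEMO_GFP_NOFS request skips the kill, while a DEMO_GFP_KERNEL request, a zero mask, or a memcg OOM with DEMO_GFP_NOFS all proceed to the killer, which matches the cases the quoted comment enumerates.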