From: Phillip Lougher
Date: Wed, 3 May 2023 19:41:58 +0100
Subject: Re: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS
To: Michal Hocko, Hui Wang
Cc: Gao Xiang, linux-mm@kvack.org, akpm@linux-foundation.org, surenb@google.com, colin.i.king@gmail.com, shy828301@gmail.com, hannes@cmpxchg.org, vbabka@suse.cz, hch@infradead.org, mgorman@suse.de

On 03/05/2023 13:20, Michal Hocko wrote:
> On Wed 03-05-23 19:49:19, Hui Wang wrote:
>> On 4/29/23 03:53, Michal Hocko wrote:
>>> On Thu 27-04-23 11:47:10, Hui Wang wrote:
>>> [...]
>>>> So Michal,
>>>>
>>>> Don't know if you read the "[PATCH 0/1] mm/oom_kill: system enters a state
>>>> something like hang when running stress-ng", do you know why out_of_memory()
>>>> will return immediately if there is no __GFP_FS, could we drop these lines
>>>> directly:
>>>>
>>>>     /*
>>>>      * The OOM killer does not compensate for IO-less reclaim.
>>>>      * pagefault_out_of_memory lost its gfp context so we have to
>>>>      * make sure exclude 0 mask - all other users should have at least
>>>>      * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
>>>>      * invoke the OOM killer even if it is a GFP_NOFS allocation.
>>>>      */
>>>>     if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
>>>>         return true;
>>> The comment is rather hard to grasp without an intimate knowledge of the
>>> memory reclaim. The primary reason is that the allocation context
>>> without __GFP_FS (and also __GFP_IO) cannot perform a full memory
>>> reclaim because fs or the storage subsystem might be holding locks
>>> required for the memory reclaim. This means that a large amount of
>>> reclaimable memory is out of sight of the specific direct reclaim
>>> context. If we allowed oom killer to trigger we could invoke the oom
>>> killer while there is a lot of otherwise reclaimable memory. As you can
>>> imagine not something many users would appreciate as the oom kill is a
>>> very disruptive operation. In this case we rely on kswapd or other
>>> GFP_KERNEL like allocation context to make forward instead. If there is
>>> really nothing reclaimable then the oom killer would eventually hit from
>>> elsewhere.
>>>
>>> HTH
>> Hi Michal,
>>
>> Understand. Thanks for explanation. So we can't remove those 2 lines of
>> code.
>>
>> Here in my patch, letting a kthread allocate a page with GFP_KERNEL, It
>> could possibly trigger the reclaim and if nothing reclaimable, trigger the
>> oom killer. Do you think it is a safe workaround for the issue we are facing
>> currently?
> I have to say I really dislike this workaround. Allocating memory just
> to release it and potentially hit the oom killer is really not very
> mindful approach to the problem. It is not a reliable way either because
> you depend on the WQ context which might be clogged for the very same
> lack of memory. This issue simply doesn't have a simple and neat
> solution unfortunately.

Agree.

> I would prefer if the fs could be less demanding from NOFS context if
> that is possible at all.

This does seem to be the best solution.

Phillip
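
For anyone following the thread, the early-return check quoted above can be modelled outside the kernel. This is only a minimal userspace sketch: the SKETCH_* flag values, struct sketch_oom_control and sketch_skips_oom_kill() are hypothetical names made up for illustration, and the bit values are not the kernel's real definitions. It only mirrors the boolean logic of the quoted condition.

```c
#include <stdbool.h>

/* Made-up bit values for illustration; the kernel's real flags differ. */
#define SKETCH_GFP_IO             0x1u
#define SKETCH_GFP_FS             0x2u
#define SKETCH_GFP_DIRECT_RECLAIM 0x4u

/* Hypothetical stand-in for the fields of the oom control structure
 * that the quoted check looks at. */
struct sketch_oom_control {
    unsigned int gfp_mask; /* 0 models the page fault path that lost its gfp context */
    bool memcg_oom;        /* models the result of is_memcg_oom(oc) */
};

/*
 * Returns true when the quoted condition would make out_of_memory()
 * return early without picking a victim: a non-zero mask that lacks
 * the FS bit, outside of a memcg OOM.
 */
static bool sketch_skips_oom_kill(const struct sketch_oom_control *oc)
{
    return oc->gfp_mask &&
           !(oc->gfp_mask & SKETCH_GFP_FS) &&
           !oc->memcg_oom;
}
```

With these assumptions, a NOFS-style mask (IO and direct reclaim set, FS clear) takes the early return, while a GFP_KERNEL-style mask, a zero mask, or a memcg OOM does not, which matches the explanation quoted from Michal above.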