From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <970ab0ac-99bd-fb90-f5a5-beadd73b8afd@canonical.com>
Date: Mon, 8 May 2023 18:05:25 +0800
Subject: Re: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS
To: Phillip Lougher, Michal Hocko
Cc: Gao Xiang, linux-mm@kvack.org, akpm@linux-foundation.org,
 surenb@google.com, colin.i.king@gmail.com, shy828301@gmail.com,
 hannes@cmpxchg.org, vbabka@suse.cz, hch@infradead.org, mgorman@suse.de
References: <20230426051030.112007-1-hui.wang@canonical.com>
 <20230426051030.112007-2-hui.wang@canonical.com>
 <68b085fe-3347-507c-d739-0dc9b27ebe05@linux.alibaba.com>
 <4aa48b6a-362d-de1b-f0ff-9bb8dafbdcc7@canonical.com>
 <70000460-ace2-3965-084d-34be65a6bd6a@squashfs.org.uk>
From: Hui Wang

On 5/8/23 05:07, Phillip Lougher wrote:
>
> On 03/05/2023 20:10, Phillip Lougher wrote:
>>
>> On 03/05/2023 12:49, Hui Wang wrote:
>>>
>>> On 4/29/23 03:53, Michal Hocko wrote:
>>>> On Thu 27-04-23 11:47:10, Hui Wang wrote:
>>>> [...]
>>>>> So Michal,
>>>>>
>>>>> Don't know if you read the "[PATCH 0/1] mm/oom_kill: system enters a state
>>>>> something like hang when running stress-ng", do you know why out_of_memory()
>>>>> will return immediately if there is no __GFP_FS, could we drop these lines
>>>>> directly:
>>>>>
>>>>>      /*
>>>>>       * The OOM killer does not compensate for IO-less reclaim.
>>>>>       * pagefault_out_of_memory lost its gfp context so we have to
>>>>>       * make sure exclude 0 mask - all other users should have at least
>>>>>       * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
>>>>>       * invoke the OOM killer even if it is a GFP_NOFS allocation.
>>>>>       */
>>>>>      if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
>>>>>          return true;
>>>> The comment is rather hard to grasp without an intimate knowledge of the
>>>> memory reclaim. The primary reason is that the allocation context
>>>> without __GFP_FS (and also __GFP_IO) cannot perform a full memory
>>>> reclaim because fs or the storage subsystem might be holding locks
>>>> required for the memory reclaim.
>>>> This means that a large amount of
>>>> reclaimable memory is out of sight of the specific direct reclaim
>>>> context. If we allowed oom killer to trigger we could invoke the oom
>>>> killer while there is a lot of otherwise reclaimable memory. As you can
>>>> imagine not something many users would appreciate as the oom kill is a
>>>> very disruptive operation. In this case we rely on kswapd or other
>>>> GFP_KERNEL like allocation context to make forward instead. If there is
>>>> really nothing reclaimable then the oom killer would eventually hit from
>>>> elsewhere.
>>>>
>>>> HTH
>>> Hi Michal,
>>>
>>> Understand. Thanks for explanation. So we can't remove those 2 lines
>>> of code.
>>>
>>> Here in my patch, letting a kthread allocate a page with GFP_KERNEL,
>>> It could possibly trigger the reclaim and if nothing reclaimable,
>>> trigger the oom killer. Do you think it is a safe workaround for the
>>> issue we are facing currently?
>>>
>>>
>>> And Hi Phillip,
>>>
>>> What is your opinion on it, do you have a direction to solve this
>>> issue from filesystem?
>>>
>>
>> The following patch creates the concept of "squashfs contexts", which
>> moves all memory dynamically allocated (in a readahead/read_page
>> path) into a single structure which can be allocated and deleted
>> once.  It then creates a pool of these at filesystem mount time.
>> Threads entering readahead/read_page will take a context from the
>> pool, and will then perform no dynamic memory allocation.
>>
>> The final patch-series will make this a non-default build option for
>> systems that need this.
>>
>> Phillip
>>
>>
>
> An updated version of the patch.

Hi Phillip,

The patch fixes the issue. I built and tested linux-next 20230505 on my
arm64 board. Without your patch, the system hangs when I run
"stress-ng --bigheap 2 --sequential 0 --timeout 30s --skip-silent --verbose" or
"stress-ng --sequential 0 --class os --timeout 30 --skip-silent --verbose".
After applying your patch, I repeated the bigheap test case 60 times and
the hang did not occur. I also ran "--class os", "--class vm" and
"--class scheduler"; all passed.

Thanks,

Hui.
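
For readers following the thread, the early-return condition quoted from out_of_memory() can be modeled in userspace. This is only a minimal sketch: the flag values, the struct and the function name below are illustrative assumptions, not the kernel's definitions. It shows the three cases the code comment describes: a zero mask (the page-fault path) is not skipped, a mask without the FS bit is skipped, and a memcg OOM is never skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; these are NOT the kernel's __GFP_* values. */
#define SKETCH_GFP_IO             0x1u
#define SKETCH_GFP_FS             0x2u
#define SKETCH_GFP_DIRECT_RECLAIM 0x4u

/* Hypothetical stand-in for the kernel's struct oom_control. */
struct oom_control_sketch {
	unsigned int gfp_mask; /* 0 models the page-fault path that lost its gfp context */
	bool memcg_oom;        /* models the result of is_memcg_oom(oc) */
};

/*
 * Returns true when the quoted check would make out_of_memory() return
 * early without invoking the OOM killer: the mask is non-zero, the FS bit
 * is missing, and the OOM was not raised by a memory cgroup.
 */
bool oom_skipped_for_nofs(const struct oom_control_sketch *oc)
{
	return oc->gfp_mask != 0 &&
	       !(oc->gfp_mask & SKETCH_GFP_FS) &&
	       !oc->memcg_oom;
}
```

With this model, an allocation that carries only the IO and direct-reclaim bits is skipped, which matches the behaviour reported at the start of the thread, where the NOFS allocations never reached the OOM killer.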