Subject: Re: [PATCH] mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
To: Oscar Salvador
References: <20240407085456.2798193-1-linmiaohe@huawei.com> <13aa38af-46a1-3894-32bd-c3eb6ef67359@huawei.com>
From: Miaohe Lin
Message-ID: <247259aa-9c78-d1ae-c829-aa72adc75922@huawei.com>
Date: Thu, 11 Apr 2024 10:26:44 +0800
Sender: owner-linux-mm@kvack.org

On 2024/4/10 16:52, Oscar Salvador wrote:
> On Wed, Apr 10, 2024 at 03:52:14PM +0800, Miaohe Lin wrote:
>> AFAICS, iff check_pages_enabled static key is enabled and in hard offline mode,
>> check_new_pages() will prevent those pages from ending up in a PCP queue again
>> when refilling PCP list. Because PageHWPoison pages will be taken as 'bad' pages
>> and skipped when refilling the PCP list.
>
> Yes, but check_pages_enabled static key is only enabled when
> either CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_VM is set, which means
> that on most systems that protection will not take place.
>
> Which takes me to a problem we had in the past where we were handing
> over hwpoisoned pages from PCP lists happily.
> Now, for soft-offline mode, we worked hard to stop doing that
> because soft-offline is a non-disruptive operation and no one should get
> killed.
> hard-offline is another story, but still I think that extending the
> comment to include the following would be a good idea:
>
> "Disabling pcp before dissolving the page was a deterministic approach
> because we made sure that those pages cannot end up in any PCP list.
> Draining PCP lists expels those pages to the buddy system, but nothing
> guarantees that those pages do not get back to a PCP queue if we need
> to refill those."

This really helps. Will add it in v2. Thanks Oscar.

>
> Just to remind ourselves of the dangers of a non-deterministic
> approach.
>
>
> Thanks
>
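
For illustration only, the drain-then-refill behaviour discussed above can be sketched as a small userspace toy model. This is not kernel code: struct toy_page, struct toy_list, and every toy_* function are hypothetical names made up for this sketch, and the check_pages flag merely stands in for the check_pages_enabled static key. It assumes a single PCP list and a flat "buddy" pool, and it only shows that a drained hwpoisoned page can return to the PCP list on refill unless the check is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 8

/* Toy page: only the state relevant to the discussion. */
struct toy_page {
	int pfn;
	bool hwpoison;	/* stands in for PageHWPoison() */
};

/* Toy LIFO list, used for both the "buddy" pool and one "PCP" list. */
struct toy_list {
	struct toy_page *pages[MAX_PAGES];
	size_t count;
};

static void toy_push(struct toy_list *l, struct toy_page *p)
{
	assert(l->count < MAX_PAGES);
	l->pages[l->count++] = p;
}

static struct toy_page *toy_pop(struct toy_list *l)
{
	return l->count ? l->pages[--l->count] : NULL;
}

/* Drain: move every page from the PCP list back to the buddy pool. */
static void toy_drain_pcp(struct toy_list *pcp, struct toy_list *buddy)
{
	struct toy_page *p;

	while ((p = toy_pop(pcp)) != NULL)
		toy_push(buddy, p);
}

/*
 * Refill: pull up to 'want' pages from the buddy pool into the PCP list.
 * With check_pages set, poisoned pages are treated as bad and skipped,
 * so they end up on neither list. Returns the number of pages added.
 */
static size_t toy_refill_pcp(struct toy_list *pcp, struct toy_list *buddy,
			     size_t want, bool check_pages)
{
	size_t added = 0;
	struct toy_page *p;

	while (added < want && (p = toy_pop(buddy)) != NULL) {
		if (check_pages && p->hwpoison)
			continue;	/* bad page never reaches the PCP list */
		toy_push(pcp, p);
		added++;
	}
	return added;
}

/* True if any page on the list is marked poisoned. */
static bool toy_list_has_poison(const struct toy_list *l)
{
	for (size_t i = 0; i < l->count; i++)
		if (l->pages[i]->hwpoison)
			return true;
	return false;
}
```

In this model, draining and then calling toy_refill_pcp() with check_pages false can put the poisoned page straight back on the PCP list, which is the non-deterministic outcome the proposed comment warns about; with check_pages true the page is skipped on refill.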