Subject: Re: [PATCH RFC] mm/memory-failure.c: fix memory failure race with memory offline
To: David Hildenbrand
CC: Andrew Morton, Oscar Salvador
From: Miaohe Lin
Date: Thu, 10 Mar 2022 21:04:23 +0800
References: <20220226094034.23938-1-linmiaohe@huawei.com> <4307e915-ac24-58bc-23ad-7e94e2b37170@redhat.com>
In-Reply-To: <4307e915-ac24-58bc-23ad-7e94e2b37170@redhat.com>

On 2022/3/1 17:53, David Hildenbrand wrote:
> On 26.02.22 10:40, Miaohe Lin wrote:
>> There is a theoretical race window between memory failure and memory
>> offline. Think about the below scene:
>>
>>   CPU A                              CPU B
>> memory_failure                     offline_pages
>>   mutex_lock(&mf_mutex);
>>   TestSetPageHWPoison(p)
>>                                      start_isolate_page_range
>>                                        has_unmovable_pages
>>                                          --PageHWPoison is movable
>>                                      do {
>>                                        scan_movable_pages
>>                                        do_migrate_range
>>                                          --PageHWPoison isn't migrated
>>                                      }
>>                                      test_pages_isolated
>>                                        --PageHWPoison is isolated
>>                                    remove_memory
>>   access page... bang
>>   ...
>
> I think the motivation for the offlining code was to not block memory
> hotunplug (especially on ZONE_MOVABLE) just because there is a
> HWpoisoned page. But how often does that happen?
>
> It's all semi-broken either way. Assume you just offlined a memory block
> with a hwpoisoned page. The memmap is stale and the information about
> hwpoison is lost.
> You can happily re-online that memory block and use
> *all* memory, including previously hwpoisoned memory. Note that this
> used to be different in the past, when the memmap was initialized when
> adding memory, not when onlining that memory.
>
>
> IMHO, we should stop special casing hwpoison. Either fail offlining
> completely if we stumble over a hwpoisoned page, or allow offlining only
> if the refcount==0 -- just as any other page.
>
>

IIUC, there is no easy way to find out whether a hwpoisoned page can be
safely offlined. If memory_failure succeeds, the page refcnt should be 1.
But if it fails, the page refcnt is unknown. So failing offlining completely
when we stumble over a hwpoisoned page seems to be the most suitable way to
close the race. But is this overkill for such rare cases? Any suggestions?

Many thanks!
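
To make the two options above concrete, here is a minimal userspace sketch,
not kernel code. struct model_page, enum offline_policy and
model_can_offline() are hypothetical names made up for illustration; they
only model PageHWPoison() and the page refcount, and assume that all
non-poisoned pages in the range have already been migrated or freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct model_page {
	bool hwpoison;	/* models PageHWPoison() */
	int refcount;	/* models the page refcount */
};

/* The two options discussed in this thread (names are assumptions). */
enum offline_policy {
	POLICY_FAIL_ON_HWPOISON,	/* any hwpoisoned page fails offlining */
	POLICY_REFCOUNT_ZERO_ONLY,	/* hwpoisoned page needs refcount == 0 */
};

/*
 * Return true if the modelled range may be offlined under @policy.
 * Only hwpoisoned pages are inspected; everything else is assumed to
 * have been handled by migration before this check.
 */
bool model_can_offline(const struct model_page *pages, size_t n,
		       enum offline_policy policy)
{
	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].hwpoison)
			continue;
		if (policy == POLICY_FAIL_ON_HWPOISON)
			return false;
		if (pages[i].refcount != 0)
			return false;
	}
	return true;
}
```

In this model, a page that memory_failure handled successfully (refcnt 1)
blocks offlining under both policies. The two only differ for a hwpoisoned
page whose refcnt has dropped to 0.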