From mboxrd@z Thu Jan 1 00:00:00 1970
From: Baolin Wang
Date: Tue, 14 Mar 2023 12:11:02 +0800
Subject: Re: [PATCH 2/2] mm: compaction: fix the possible deadlock when isolating hugetlb pages
To: Mike Kravetz
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, osalvador@suse.de, vbabka@suse.cz, william.lam@bytedance.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <1bc1c955b03603c4e14f56dfbbef9f637f18dbbd.1678703534.git.baolin.wang@linux.alibaba.com> <20230313170838.GA3044@monkey>
In-Reply-To: <20230313170838.GA3044@monkey>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 3/14/2023 1:08 AM, Mike Kravetz wrote:
> On 03/13/23 18:37, Baolin Wang wrote:
>> When trying to isolate a migratable pageblock, it can contain several
>> normal pages or several hugetlb pages (e.g. CONT-PTE 64K hugetlb on arm64)
>> in a pageblock. That means we may hold the lru lock of a normal page to
>> continue to isolate the next hugetlb page by isolate_or_dissolve_huge_page()
>> in the same migratable pageblock.
>>
>> However in the isolate_or_dissolve_huge_page(), it may allocate a new hugetlb
>> page and dissolve the old one by alloc_and_dissolve_hugetlb_folio() if the
>> hugetlb's refcount is zero.
>> That means we can still enter the direct compaction
>> path to allocate a new hugetlb page under the current lru lock, which
>> may cause possible deadlock.
>>
>> To avoid this possible deadlock, we should release the lru lock when trying
>> to isolate a hugetlb page. Moreover it does not make sense to take the lru
>> lock to isolate a hugetlb, which is not in the lru list.
>>
>> Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages")
>> Signed-off-by: Baolin Wang
>> ---
>>  mm/compaction.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index c9d9ad958e2a..ac8ff152421a 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>
> Thanks!
>
> I suspect holding the lru lock when calling isolate_or_dissolve_huge_page was
> not considered. However, I wonder if this can really happen in practice?
>
> Before the code below, there is this:
>
> 		/*
> 		 * Periodically drop the lock (if held) regardless of its
> 		 * contention, to give chance to IRQs. Abort completely if
> 		 * a fatal signal is pending.
> 		 */
> 		if (!(low_pfn % COMPACT_CLUSTER_MAX)) {
> 			if (locked) {
> 				unlock_page_lruvec_irqrestore(locked, flags);
> 				locked = NULL;
> 			}
> 			...
> 		}
>
> It would seem that the pfn of a hugetlb page would always be a multiple of
> COMPACT_CLUSTER_MAX so we would drop the lock. However, I am not sure if
> that is ALWAYS true and would prefer something like the code you suggested.

Well, this is not always true. Consider a CONT-PTE hugetlb page on arm64,
which consists of 16 contiguous normal pages: its starting pfn is only
guaranteed to be a multiple of 16, not of COMPACT_CLUSTER_MAX.

> Did you actually see this deadlock in practice?

I have not seen this issue in practice so far, but from code inspection I
think it can be triggered when trying to isolate a CONT-PTE hugetlb page.