Message-ID: <20a6b1c1-389e-b57a-7a5c-d1b0a7185412@huawei.com>
Date: Wed, 12 Mar 2025 17:18:26 +0800
Subject: Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released
From: Jinjiang Tu
To: Peter Xu
CC: jimsiak
References: <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr> <20250307072133.3522652-1-tujinjiang@huawei.com> <46ac83f7-d3e0-b667-7352-d853938c9fc9@huawei.com>

On 2025/3/11 16:14, Jinjiang Tu wrote:
>
> On 2025/3/11 2:50, Peter Xu wrote:
>> On Mon, Mar 10, 2025 at 02:40:35PM +0800, Jinjiang Tu wrote:
>>> On 2025/3/8 6:41, Peter Xu wrote:
>>>> On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote:
>>>>> Hi,
>>>>>
>>>>> From my side, I managed to avoid the freezing of processes with the
>>>>> following change in function userfaultfd_release() in file
>>>>> fs/userfaultfd.c
>>>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842):
>>>>>
>>>>> I moved the following command from line 851:
>>>>> WRITE_ONCE(ctx->released, true);
>>>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851)
>>>>>
>>>>> to line 905, that is, exactly before the function returns 0.
>>>>>
>>>>> That simple workaround worked for my use case, but I am far from
>>>>> sure that it is a correct/sufficient fix for the problem at hand.
>>>> Updating the field after userfaultfd_ctx_put() might mean UAF, afaict.
>>>>
>>>> Maybe it's possible to remove ctx->released and rely only on the
>>>> mmap write lock.  However, that'll need a closer look and more thought.
>>>>
>>>> To me, the more straightforward way to fix it is to use the patch I
>>>> mentioned in the other email:
>>>>
>>>> https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/
>>>>
>>>> Or does it mean it didn't work at all?
>>> This patch works for me. The mlock() syscall calls GUP with
>>> FOLL_UNLOCKABLE, which allows it to release the mmap lock and retry.
>>>
>>> But other GUP calls without FOLL_UNLOCKABLE will return VM_FAULT_SIGBUS.
>>> Is that a regression from the commit below?
>> Do you have an explicit reproducer / use case of such?
>>
>> AFAIU, below commit should only change it from SIGBUS to NOPAGE when
>> "released" is set.  I don't see how it can regress on !FOLL_UNLOCKABLE.
>>
>> Thanks,
>
> You are right, the commit below seems to only care about page faults
> from userspace (which have the FAULT_FLAG_ALLOW_RETRY flag), and doesn't
> care about GUP from drivers (which may be !FOLL_UNLOCKABLE).
>
> Thanks.

Hi Peter,

Since this patch works, could you please send a formal patch to the
mailing list?

Thanks.

>
>>> commit 656710a60e3693911bee3a355d2f2bbae3faba33
>>> Author: Andrea Arcangeli
>>> Date:   Fri Sep 8 16:12:42 2017 -0700
>>>
>>>      userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS
>>>
>>>> Thanks,
>>>>
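
For reference, here is a minimal user-space sketch of the ordering concern raised in the thread, namely that updating ctx->released after userfaultfd_ctx_put() might be a use-after-free. It is only a single-threaded toy model of a refcounted object, not the kernel code: struct toy_ctx, toy_ctx_new(), toy_ctx_put() and toy_release() are hypothetical names invented for illustration, and the plain int refcount stands in for the kernel's atomic reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted context; NOT struct userfaultfd_ctx. */
struct toy_ctx {
	int refcount;   /* plain int: this toy model is single-threaded */
	bool released;  /* models the "released" flag discussed in the thread */
};

static int toy_frees;   /* number of contexts freed so far */

/* Allocate a context holding one reference, or return NULL on failure. */
static struct toy_ctx *toy_ctx_new(void)
{
	struct toy_ctx *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		ctx->refcount = 1;
	return ctx;
}

/* Drop one reference; returns true if this call freed the object. */
static bool toy_ctx_put(struct toy_ctx *ctx)
{
	assert(ctx->refcount > 0);
	if (--ctx->refcount > 0)
		return false;
	free(ctx);
	toy_frees++;
	return true;
}

/*
 * Safe ordering: set the flag while our reference is still held, then
 * drop it. If the flag were written after toy_ctx_put() instead, and
 * that put happened to drop the last reference, the write would touch
 * freed memory.
 */
static bool toy_release(struct toy_ctx *ctx)
{
	ctx->released = true;     /* object is guaranteed alive here */
	return toy_ctx_put(ctx);  /* ctx must not be touched after this */
}
```

In this model, a caller holding the only reference gets true back from toy_release() and must treat the pointer as dead afterwards, which is why the flag update has to come first.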