Message-ID: <6cb0a740-f597-4a13-8fe5-43f94d222c70@gmail.com>
Date: Sat, 5 Oct 2024 20:38:51 +0800
Subject: Re: [PATCH net v2 2/2] page_pool: fix IOMMU crash when driver has already unbound
To: Paolo Abeni, Yunsheng Lin, Ilias Apalodimas
Cc: liuyonglong@huawei.com, fanghaiqing@huawei.com, zhangkun09@huawei.com,
 Robin Murphy, Alexander Duyck, IOMMU, Wei Fang, Shenwei Wang, Clark Wang,
 Eric Dumazet, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
 Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Felix Fietkau,
 Lorenzo Bianconi, Ryder Lee, Shayne Chen, Sean Wang, Kalle Valo,
 Matthias Brugger, AngeloGioacchino Del Regno, Andrew Morton,
 imx@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 intel-wired-lan@lists.osuosl.org, bpf@vger.kernel.org,
 linux-rdma@vger.kernel.org, linux-wireless@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org,
 linux-mm@kvack.org, davem@davemloft.net, kuba@kernel.org
References: <20240925075707.3970187-1-linyunsheng@huawei.com>
 <20240925075707.3970187-3-linyunsheng@huawei.com>
 <4968c2ec-5584-4a98-9782-143605117315@redhat.com>
 <33f23809-abec-4d39-ab80-839dc525a2e6@gmail.com>
 <4316fa2d-8dd8-44f2-b211-4b2ef3200d75@redhat.com>
From: Yunsheng Lin
In-Reply-To: <4316fa2d-8dd8-44f2-b211-4b2ef3200d75@redhat.com>

On 10/2/2024 3:37 PM, Paolo Abeni wrote:
> Hi,
>
> On 10/2/24 04:34, Yunsheng Lin wrote:
>> On 10/1/2024 9:32 PM, Paolo Abeni wrote:
>>> Is the problem only tied to VFs drivers? It's a pity all the page_pool
>>> users will have to pay a bill for it...
>>
>> I am afraid it is not only tied to VFs drivers, as:
>> attempting DMA unmaps after the driver has already unbound may leak
>> resources or at worst corrupt memory.
>>
>> Unloading PFs driver might cause the above problems too, I guess the
>> probability of crashing is low for the PF as PF cannot be disabled
>> unless it can be hot-unplugged, but the probability of leaking resources
>> behind the dma mapping might be similar.
>
> Out of sheer ignorance, why/how the refcount acquired by the page pool
> on the device does not prevent unloading?

I am not sure I understand the reasoning behind that, but judging from
the implementation of __device_release_driver(), driver unloading does
not seem to check the refcount of the device.

>
> I fear the performance impact could be very high: AFAICS, if the item
> array becomes fragmented, insertion will take linear time, with the quite
> large item_count/pool size.
> If so, it looks like a no-go.

The last checked index is recorded in pool->item_idx, so insertion will
mostly not take linear time, unless pool->items is almost full and the
old item that came back to the page_pool was only just checked. The
thought is that if it comes to this point, the page_pool is likely not
the bottleneck anymore, and adding unlimited pool->items might not make
any difference.

If the insertion does turn out to be a bottleneck, 'struct llist_head'
can be used to record the old items locklessly on the freeing side, and
llist_del_all() can be used on the allocating side to refill the old
items from the freeing side, which is similar to how pool->ring and
pool->alloc are currently used in page_pool. As this patchset is already
complicated and doing this would make it more so, I am not sure it is
worth the effort right now, as the benefit does not seem obvious yet.

>
> I fear we should consider blocking the device removal until all the
> pages are returned/unmapped ?!? (I hope that could be easier/faster)

As Ilias pointed out, blocking the device removal until all the pages
are returned/unmapped might cause an infinite delay in our testing:

https://lore.kernel.org/netdev/d50ac1a9-f1e2-49ee-b89b-05dac9bc6ee1@huawei.com/

>
> /P
>
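
The pool->item_idx argument above can be illustrated with a small
userspace sketch. This is not the code from the patch: struct item_pool,
ITEM_CNT, item_add() and item_del() are hypothetical names, and the
sketch is single-threaded, whereas a real page_pool would need atomic
operations on the slots. It only shows why starting each search at the
last checked index usually finds a free slot after very few probes.

```c
#include <stddef.h>

#define ITEM_CNT 8 /* hypothetical pool size, chosen only for the example */

struct item_pool {
	void *items[ITEM_CNT]; /* NULL means the slot is free */
	size_t item_idx;       /* index at which the next search starts */
};

/*
 * Store ptr in the first free slot found at or after item_idx, wrapping
 * around once. Returns the slot index, or -1 if every slot is in use.
 */
int item_add(struct item_pool *pool, void *ptr)
{
	for (size_t i = 0; i < ITEM_CNT; i++) {
		size_t idx = (pool->item_idx + i) % ITEM_CNT;

		if (!pool->items[idx]) {
			pool->items[idx] = ptr;
			/* remember where to resume, so used slots are not rescanned */
			pool->item_idx = (idx + 1) % ITEM_CNT;
			return (int)idx;
		}
	}
	return -1;
}

/* Mark a slot as free again; the search cursor is left untouched. */
void item_del(struct item_pool *pool, int idx)
{
	pool->items[idx] = NULL;
}
```

In this sketch a slot freed behind the cursor is only reused after the
search wraps around, so the number of probes approaches ITEM_CNT only
when nearly every slot is occupied.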
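
The 'struct llist_head' idea mentioned above follows a common lockless
pattern: the freeing side pushes nodes onto a shared list one at a time,
and the allocating side detaches the whole list in a single atomic
exchange. Below is a minimal userspace analogue using C11 atomics. It is
only a sketch of the pattern, not the kernel's <linux/llist.h> and not
part of the patch; struct lnode, struct lhead, lstack_push() and
lstack_take_all() are hypothetical names standing in for the roles that
llist_add() and llist_del_all() would play.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Freeing side: push one node; safe against concurrent pushers. */
void lstack_push(struct lhead *head, struct lnode *node)
{
	struct lnode *old =
		atomic_load_explicit(&head->first, memory_order_relaxed);

	do {
		/* on a failed exchange, old is reloaded and the link redone */
		node->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &old,
							node,
							memory_order_release,
							memory_order_relaxed));
}

/*
 * Allocating side: detach every queued node with one atomic exchange.
 * The returned chain is private to the caller and is in LIFO order.
 */
struct lnode *lstack_take_all(struct lhead *head)
{
	return atomic_exchange_explicit(&head->first, NULL,
					memory_order_acquire);
}
```

Because the consumer takes the entire chain at once, it can walk the
detached nodes without further synchronization, which is what would make
this shape resemble the existing split between a shared ring and a
private allocation cache.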