From: Yunsheng Lin
Date: Thu, 31 Oct 2024 20:17:26 +0800
Subject: Re: [PATCH net-next v3 3/3] page_pool: fix IOMMU crash when driver has already unbound
To: Toke Høiland-Jørgensen, Jesper Dangaard Brouer
Cc: Robin Murphy, Alexander Duyck, IOMMU, Andrew Morton, Eric Dumazet, Ilias Apalodimas, kernel-team
Message-ID: <023fdee7-dbd4-4e78-b911-a7136ff81343@huawei.com>
In-Reply-To: <87o731by64.fsf@toke.dk>

On 2024/10/30 19:57, Toke Høiland-Jørgensen wrote:
> Yunsheng Lin writes:
>
>>> But, well, I'm not sure it is? You seem to be taking it as axiomatic
>>> that the wait in itself is bad. Why? It's just a bit of memory being held
>>> on to while it is still in use, and so what?
>>
>> Actually, I thought about adding some sort of timeout or kicking based on
>> jakub's waiting patch too.
>>
>> But after looking at more of the caching in the networking stack, waiting and
>> kicking/flushing seems harder than recording the inflight pages, mainly
>> because kicking/flushing needs every subsystem using page_pool-owned pages to
>> provide a kicking/flushing mechanism for it to work, not to mention how much
>> time it takes to do all the kicking/flushing.
>
> Eliding the details above, but yeah, you're right, there are probably
> some pernicious details to get right if we want to flush all caches. So
> I wouldn't do that to start with. Instead, just add the waiting to start
> with, then wait and see if this actually turns out to be a problem in
> practice. And if it is, identify the source of that problem, deal with
> it, rinse and repeat :)

I am not sure whether I have mentioned that Jakub had an RFC for the waiting
approach, see [1]. Yonglong (Cc'ed) tested it: the waiting caused the driver
unload to stall forever and some tasks to hang, see [2].

The root cause in that case is skb_defer_free_flush() not being called, as
mentioned before.

I am not sure I understand the reasoning behind the suggestion to 'wait and
see if this actually turns out to be a problem' when we already know of cases
that need cache kicking/flushing for the waiting to work. That
kicking/flushing may not be easy and may take an indefinite amount of time,
not to mention that there might be other cases needing kicking/flushing that
we don't know about yet.

Is there any reason not to consider recording the inflight pages, so that
they can be unmapped before the driver is unbound, supposing a dynamic number
of inflight pages can be supported?

IOW, is there any reason you and Jesper are taking it as axiomatic that
recording the inflight pages is bad, supposing the number of inflight pages
can be unlimited and the recording can be done with minimal performance
overhead?

Or is there any better idea other than recording the inflight pages and doing
the kicking/flushing during waiting?

1. https://lore.kernel.org/netdev/20240806151618.1373008-1-kuba@kernel.org/
2. https://lore.kernel.org/netdev/758b4d47-c980-4f66-b4a4-949c3fc4b040@huawei.com/

>
> -Toke
>
>