From: Jesper Dangaard Brouer <hawk@kernel.org>
To: "Yunsheng Lin" <linyunsheng@huawei.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Cc: zhangkun09@huawei.com, fanghaiqing@huawei.com,
liuyonglong@huawei.com, Robin Murphy <robin.murphy@arm.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
IOMMU <iommu@lists.linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Eric Dumazet <edumazet@google.com>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, kernel-team <kernel-team@cloudflare.com>
Subject: Re: [PATCH net-next v3 3/3] page_pool: fix IOMMU crash when driver has already unbound
Date: Tue, 12 Nov 2024 15:19:53 +0100
Message-ID: <be049c33-936a-4c93-94ff-69cd51b5de8e@kernel.org>
In-Reply-To: <eab44c89-5ada-48b6-b880-65967c0f3b49@huawei.com>
On 12/11/2024 13.22, Yunsheng Lin wrote:
> On 2024/11/12 2:51, Toke Høiland-Jørgensen wrote:
>
> ...
>
>>>
>>> Is there any other suggestion/concern about how to fix the problem here?
>>>
>>> From the previous discussion, it seems the main concern about tracking the
>>> inflight pages is how many inflight pages need to be tracked.
>>
>> Yeah, my hardest objection was against putting a hard limit on the
>> number of outstanding pages.
>>
>>> If there is no other suggestion/concern, it seems the above concern might be
>>> addressed by using pre-allocated memory to satisfy the most common case, and
>>> using dynamically allocated memory if/when necessary.
>>
>> For this, my biggest concern would be performance.
>>
>> In general, doing extra work in rarely used code paths (such as device
>> teardown) is much preferred to adding extra tracking in the fast path.
>> Which would be an argument for Alexander's suggestion of just scanning
>> the entire system page table to find pages to unmap. Don't know enough
>> about mm system internals to have an opinion on whether this is
>> feasible, though.
>
> Yes, there seem to be many MM system internals, like the CONFIG_SPARSEMEM*
> configs, memory offline/online and other MM-specific optimizations, that make
> it hard to tell whether it is feasible.
>
> It would be good if MM experts can clarify on this.
>
Yes, please. Can Alex Duyck or the MM experts point me at some code that
walks the entire system page table?
Then I'll write some kernel code (maybe a module) to benchmark how
long it takes on my machine with 384GiB. I do like Alex's suggestion,
but I want to assess the overhead of doing this on modern hardware.
>>
>> In any case, we'll need some numbers to really judge the overhead in
>> practice. So benchmarking would be the logical next step in any case :)
>
> POC code shows that using dynamic memory allocation does not
> seem to add much overhead compared to the pre-allocated memory
> in this patch; the overhead is about 10~20ns, which seems to be similar to
> the overhead added by the patch.
>
An overhead of around 10~20ns is too large for page_pool, because the XDP
DDoS use-case (which is what page_pool was designed for) has a very small
time budget per packet.
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/hints/traits01_bench_kmod.org#benchmark-basics
| Link speed | Packet rate           | Time-budget   |
|            | at smallest pkts size | per packet    |
|------------+-----------------------+---------------|
| 10 Gbit/s  | 14,880,952 pps        | 67.2 nanosec  |
| 25 Gbit/s  | 37,202,381 pps        | 26.88 nanosec |
| 100 Gbit/s | 148,809,523 pps       | 6.72 nanosec  |
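
For reference, the numbers in the table can be reproduced with a small
sketch like the one below. It assumes the usual Ethernet accounting for the
smallest packet, which is not spelled out in this mail: a 64-byte frame plus
8 bytes of preamble/SFD and 12 bytes of inter-frame gap, i.e. 84 bytes
(672 bits) on the wire. The function names are made up for illustration.

```python
# Assumption (not stated in the mail): a minimum-size Ethernet frame occupies
# 64 bytes + 8 bytes preamble/SFD + 12 bytes inter-frame gap on the wire.
MIN_FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 8 + 12
BITS_PER_PACKET = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8  # 672 bits


def packet_rate_pps(link_gbps):
    """Maximum packets per second at the smallest packet size."""
    return link_gbps * 1e9 / BITS_PER_PACKET


def time_budget_ns(link_gbps):
    """Time available per packet in nanoseconds (bits / (Gbit/s) gives ns)."""
    return BITS_PER_PACKET / link_gbps


def budget_fraction(overhead_ns, link_gbps):
    """Share of the per-packet budget consumed by a fixed overhead."""
    return overhead_ns / time_budget_ns(link_gbps)


if __name__ == "__main__":
    for gbps in (10, 25, 100):
        print(f"{gbps:>3} Gbit/s: {packet_rate_pps(gbps):>13,.0f} pps, "
              f"{time_budget_ns(gbps):.2f} ns per packet, "
              f"10 ns overhead = {budget_fraction(10, gbps):.0%} of budget")
```

Under that assumption, an extra 10 ns per packet is already more than the
whole 6.72 ns budget at 100 Gbit/s.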
--Jesper