From: Yunsheng Lin
To: Jesper Dangaard Brouer, Toke Høiland-Jørgensen
CC: Robin Murphy, Alexander Duyck, IOMMU, Andrew Morton, Eric Dumazet, Ilias Apalodimas, kernel-team
Subject: Re: [PATCH net-next v3 3/3] page_pool: fix IOMMU crash when driver has already unbound
Date: Wed, 13 Nov 2024 20:21:25 +0800
Message-ID: <59675831-d52e-47c0-85ca-5d3bf4d44917@huawei.com>

On 2024/11/12 22:19, Jesper Dangaard Brouer wrote:

...

>>>
>>> In any case, we'll need some numbers to really judge the overhead in
>>> practice. So benchmarking would be the logical next step in any case :)
>>
>> Using POC code show that using the dynamic memory allocation does not
>> seems to be adding much overhead than the pre-allocated memory allocation
>> in this patch, the overhead is about 10~20ns, which seems to be similar to
>> the overhead of added overhead in the patch.
>>
>
> Overhead around 10~20ns is too large for page_pool, because XDP DDoS
> use-case have a very small time budget (which is what page_pool was
> designed for).

I should have mentioned that the above 10~20ns overhead is from the
test case of time_bench_page_pool03_slow() in bench_page_pool_simple.

More detailed test results are below:

After:
root@(none)$ taskset -c 0 insmod bench_page_pool_simple.ko
[ 50.359865] bench_page_pool_simple: Loaded
[ 50.440982] time_bench: Type:for_loop Per elem: 0 cycles(tsc) 0.769 ns (step:0) - (measurement period time:0.076980410 sec time_interval:76980410) - (invoke count:100000000 tsc_interval:7698030)
[ 52.497915] time_bench: Type:atomic_inc Per elem: 2 cycles(tsc) 20.396 ns (step:0) - (measurement period time:2.039650210 sec time_interval:2039650210) - (invoke count:100000000 tsc_interval:203965016)
[ 52.665872] time_bench: Type:lock Per elem: 1 cycles(tsc) 15.006 ns (step:0) - (measurement period time:0.150067780 sec time_interval:150067780) - (invoke count:10000000 tsc_interval:15006773)
[ 53.337133] time_bench: Type:rcu Per elem: 0 cycles(tsc) 6.541 ns (step:0) - (measurement period time:0.654153620 sec time_interval:654153620) - (invoke count:100000000 tsc_interval:65415355)
[ 53.354152] bench_page_pool_simple: time_bench_page_pool01_fast_path(): Cannot use page_pool fast-path
[ 53.647814] time_bench: Type:no-softirq-page_pool01 Per elem: 2 cycles(tsc) 28.436 ns (step:0) - (measurement period time:0.284369800 sec time_interval:284369800) - (invoke count:10000000 tsc_interval:28436974)
[ 53.666482] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): Cannot use page_pool fast-path
[ 54.264789] time_bench: Type:no-softirq-page_pool02 Per elem: 5 cycles(tsc) 58.910 ns (step:0) - (measurement period time:0.589102240 sec time_interval:589102240) - (invoke count:10000000 tsc_interval:58910216)
[ 54.283459] bench_page_pool_simple: time_bench_page_pool03_slow(): Cannot use page_pool fast-path
[ 56.202440] time_bench: Type:no-softirq-page_pool03 Per elem: 19 cycles(tsc) 191.012 ns (step:0) - (measurement period time:1.910122260 sec time_interval:1910122260) - (invoke count:10000000 tsc_interval:191012216)
[ 56.221463] bench_page_pool_simple: pp_tasklet_handler(): in_serving_softirq fast-path
[ 56.229367] bench_page_pool_simple: time_bench_page_pool01_fast_path(): in_serving_softirq fast-path
[ 56.521551] time_bench: Type:tasklet_page_pool01_fast_path Per elem: 2 cycles(tsc) 28.306 ns (step:0) - (measurement period time:0.283066000 sec time_interval:283066000) - (invoke count:10000000 tsc_interval:28306590)
[ 56.540827] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): in_serving_softirq fast-path
[ 57.203988] time_bench: Type:tasklet_page_pool02_ptr_ring Per elem: 6 cycles(tsc) 65.412 ns (step:0) - (measurement period time:0.654129240 sec time_interval:654129240) - (invoke count:10000000 tsc_interval:65412917)
[ 57.223177] bench_page_pool_simple: time_bench_page_pool03_slow(): in_serving_softirq fast-path
[ 59.297677] time_bench: Type:tasklet_page_pool03_slow Per elem: 20 cycles(tsc) 206.581 ns (step:0) - (measurement period time:2.065816850 sec time_interval:2065816850) - (invoke count:10000000 tsc_interval:206581679)

Before:
root@(none)$ taskset -c 0 insmod bench_page_pool_simple.ko
[ 519.020980] bench_page_pool_simple: Loaded
[ 519.102080] time_bench: Type:for_loop Per elem: 0 cycles(tsc) 0.769 ns (step:0) - (measurement period time:0.076979320 sec time_interval:76979320) - (invoke count:100000000 tsc_interval:7697917)
[ 520.466133] time_bench: Type:atomic_inc Per elem: 1 cycles(tsc) 13.467 ns (step:0) - (measurement period time:1.346763300 sec time_interval:1346763300) - (invoke count:100000000 tsc_interval:134676325)
[ 520.634079] time_bench: Type:lock Per elem: 1 cycles(tsc) 15.005 ns (step:0) - (measurement period time:0.150054340 sec time_interval:150054340) - (invoke count:10000000 tsc_interval:15005430)
[ 521.190881] time_bench: Type:rcu Per elem: 0 cycles(tsc) 5.396 ns (step:0) - (measurement period time:0.539696370 sec time_interval:539696370) - (invoke count:100000000 tsc_interval:53969632)
[ 521.207901] bench_page_pool_simple: time_bench_page_pool01_fast_path(): Cannot use page_pool fast-path
[ 521.514478] time_bench: Type:no-softirq-page_pool01 Per elem: 2 cycles(tsc) 29.728 ns (step:0) - (measurement period time:0.297282500 sec time_interval:297282500) - (invoke count:10000000 tsc_interval:29728246)
[ 521.533148] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): Cannot use page_pool fast-path
[ 522.117048] time_bench: Type:no-softirq-page_pool02 Per elem: 5 cycles(tsc) 57.469 ns (step:0) - (measurement period time:0.574694970 sec time_interval:574694970) - (invoke count:10000000 tsc_interval:57469491)
[ 522.135717] bench_page_pool_simple: time_bench_page_pool03_slow(): Cannot use page_pool fast-path
[ 523.962813] time_bench: Type:no-softirq-page_pool03 Per elem: 18 cycles(tsc) 181.823 ns (step:0) - (measurement period time:1.818238850 sec time_interval:1818238850) - (invoke count:10000000 tsc_interval:181823878)
[ 523.981837] bench_page_pool_simple: pp_tasklet_handler(): in_serving_softirq fast-path
[ 523.989742] bench_page_pool_simple: time_bench_page_pool01_fast_path(): in_serving_softirq fast-path
[ 524.296961] time_bench: Type:tasklet_page_pool01_fast_path Per elem: 2 cycles(tsc) 29.810 ns (step:0) - (measurement period time:0.298100890 sec time_interval:298100890) - (invoke count:10000000 tsc_interval:29810083)
[ 524.316236] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): in_serving_softirq fast-path
[ 524.852783] time_bench: Type:tasklet_page_pool02_ptr_ring Per elem: 5 cycles(tsc) 52.751 ns (step:0) - (measurement period time:0.527516430 sec time_interval:527516430) - (invoke count:10000000 tsc_interval:52751638)
[ 524.871972] bench_page_pool_simple: time_bench_page_pool03_slow(): in_serving_softirq fast-path
[ 526.710040] time_bench: Type:tasklet_page_pool03_slow Per elem: 18 cycles(tsc) 182.938 ns (step:0) - (measurement period time:1.829384610 sec time_interval:1829384610) - (invoke count:10000000 tsc_interval:182938456)

>
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/hints/traits01_bench_kmod.org#benchmark-basics
>
>  | Link speed | Packet rate           | Time-budget   |
>  |            | at smallest pkts size | per packet    |
>  |------------+-----------------------+---------------|
>  |  10 Gbit/s |  14,880,952 pps       | 67.2 nanosec  |
>  |  25 Gbit/s |  37,202,381 pps       | 26.88 nanosec |
>  | 100 Gbit/s | 148,809,523 pps       |  6.72 nanosec |
>
>
> --Jesper
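
For reference, the arithmetic behind the quoted time-budget table and the 10~20ns figure can be sketched as below. This is only an illustration, not part of the patch or of bench_page_pool_simple. It assumes the table is computed for a 64-byte minimum Ethernet frame plus 20 bytes of on-wire overhead (preamble, start-of-frame delimiter, inter-frame gap); the table does not state this, but the assumption reproduces its values. The per-element costs are copied from the page_pool03 lines of the Before and After runs, and all function and variable names are made up for the sketch.

```python
# Assumed Ethernet figures (not stated in the quoted table):
# 64-byte minimum frame, plus 7-byte preamble + 1-byte start-of-frame
# delimiter + 12-byte inter-frame gap = 20 bytes of wire overhead.
MIN_FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 20

# Per-element cost in ns, copied from the page_pool03 lines of the
# "Before" and "After" benchmark runs in this mail.
RESULTS_NS = {
    "no-softirq-page_pool03": {"before": 181.823, "after": 191.012},
    "tasklet_page_pool03_slow": {"before": 182.938, "after": 206.581},
}


def packets_per_second(link_gbps: float) -> float:
    """Packet rate at the smallest packet size for a given link speed."""
    bits_per_packet = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_packet


def time_budget_ns(link_gbps: float) -> float:
    """Time available per packet, in nanoseconds, at line rate."""
    return 1e9 / packets_per_second(link_gbps)


def added_overhead_ns(test_name: str) -> float:
    """Difference between the 'After' and 'Before' per-element cost."""
    result = RESULTS_NS[test_name]
    return result["after"] - result["before"]


if __name__ == "__main__":
    for gbps in (10, 25, 100):
        print(f"{gbps:>3} Gbit/s: {packets_per_second(gbps):,.0f} pps, "
              f"{time_budget_ns(gbps):.2f} ns per packet")
    for name in RESULTS_NS:
        print(f"{name}: +{added_overhead_ns(name):.3f} ns per element")
```

The sketch only puts the two sets of numbers side by side; it does not claim that the page_pool03 slow-path test models the per-packet cost of an XDP workload.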