From: Pavel Begunkov <asml.silence@gmail.com>
To: Toke Høiland-Jørgensen, "David S. Miller", Jakub Kicinski,
 Jesper Dangaard Brouer, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
 Andrew Lunn, Eric Dumazet, Paolo Abeni, Ilias Apalodimas, Simon Horman,
 Andrew Morton, Mina Almasry, Yonglong Liu, Yunsheng Lin, Matthew Wilcox
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-mm@kvack.org, Qiuling Ren, Yuying Ma
Subject: Re: [PATCH net-next v6 2/2] page_pool: Track DMA-mapped pages and
 unmap them when destroying the pool
Date: Wed, 2 Apr 2025 12:40:20 +0100
Message-ID: <2c5821c8-0ba5-42a3-bcdd-330d8ef736d0@gmail.com>
In-Reply-To: <87jz82n7j3.fsf@toke.dk>
References: <20250401-page-pool-track-dma-v6-0-8b83474870d4@redhat.com>
 <20250401-page-pool-track-dma-v6-2-8b83474870d4@redhat.com>
 <3e0eb1fa-b501-4573-be9f-3d8e52593f75@gmail.com>
 <87jz82n7j3.fsf@toke.dk>

On 4/2/25 12:10, Toke Høiland-Jørgensen wrote:
> Pavel Begunkov writes:
> 
>> On 4/1/25 10:27, Toke Høiland-Jørgensen wrote:
>> ...
>>> Reported-by: Yonglong Liu
>>> Closes: https://lore.kernel.org/r/8743264a-9700-4227-a556-5f931c720211@huawei.com
>>> Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code")
>>> Suggested-by: Mina Almasry
>>> Reviewed-by: Mina Almasry
>>> Reviewed-by: Jesper Dangaard Brouer
>>> Tested-by: Jesper Dangaard Brouer
>>> Tested-by: Qiuling Ren
>>> Tested-by: Yuying Ma
>>> Tested-by: Yonglong Liu
>>> Acked-by: Jesper Dangaard Brouer
>>> Signed-off-by: Toke Høiland-Jørgensen
>>
>> I haven't looked into the bit carving, but the rest looks
>> good to me. A few nits below,
>>
>> ...
>>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>>> index 7745ad924ae2d801580a6760eba9393e1cf67b01..52b5ddab7ecb405066fd55b8d61abfd4186b9dcf 100644
>>> --- a/net/core/page_pool.c
>>> +++ b/net/core/page_pool.c
>>> @@ -227,6 +227,8 @@ static int page_pool_init(struct page_pool *pool,
>>>  			return -EINVAL;
>>>  
>>>  		pool->dma_map = true;
>>> +
>>> +		xa_init_flags(&pool->dma_mapped, XA_FLAGS_ALLOC1);
>>
>> nit: might be better to init/destroy unconditionally, it doesn't
>> allocate any memory.
> 
> Hmm, yeah, suppose both could work; I do think this makes it clearer
> that it's tied to DMA mapping, but I won't insist. Not sure it's worth
> respinning just for this, though (see below).

That's a somewhat safer way, but yes, I agree, it's not worth a respin.

> 
>>>  	}
>>>  
>>>  	if (pool->slow.flags & PP_FLAG_DMA_SYNC_DEV) {
>>> @@ -276,9 +278,6 @@ static int page_pool_init(struct page_pool *pool,
>>>  	/* Driver calling page_pool_create() also call page_pool_destroy() */
>>>  	refcount_set(&pool->user_cnt, 1);
>>>  
>>> -	if (pool->dma_map)
>>> -		get_device(pool->p.dev);
>>> -
>>>  	if (pool->slow.flags & PP_FLAG_ALLOW_UNREADABLE_NETMEM) {
>>>  		netdev_assert_locked(pool->slow.netdev);
>>>  		rxq = __netif_get_rx_queue(pool->slow.netdev,
>>> @@ -322,7 +321,7 @@ static void page_pool_uninit(struct page_pool *pool)
>>>  	ptr_ring_cleanup(&pool->ring, NULL);
>>>  
>>>  	if (pool->dma_map)
>>> -		put_device(pool->p.dev);
>>> +		xa_destroy(&pool->dma_mapped);
>>>  
>>>  #ifdef CONFIG_PAGE_POOL_STATS
>>>  	if (!pool->system)
>>> @@ -463,13 +462,21 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
>>>  					      netmem_ref netmem,
>>>  					      u32 dma_sync_size)
>>>  {
>>> -	if (pool->dma_sync && dma_dev_need_sync(pool->p.dev))
>>> -		__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
>>> +	if (READ_ONCE(pool->dma_sync) && dma_dev_need_sync(pool->p.dev)) {
>>> +		rcu_read_lock();
>>> +		/* re-check under rcu_read_lock() to sync with page_pool_scrub() */
>>> +		if (READ_ONCE(pool->dma_sync))
>>> +			__page_pool_dma_sync_for_device(pool, netmem,
>>> +							dma_sync_size);
>>> +		rcu_read_unlock();
>>> +	}
>>>  }
>>>  
>>> -static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
>>> +static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t gfp)
>>>  {
>>>  	dma_addr_t dma;
>>> +	int err;
>>> +	u32 id;
>>>  
>>>  	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
>>>  	 * since dma_addr_t can be either 32 or 64 bits and does not always fit
>>> @@ -483,15 +490,28 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
>>>  	if (dma_mapping_error(pool->p.dev, dma))
>>>  		return false;
>>>  
>>> -	if (page_pool_set_dma_addr_netmem(netmem, dma))
>>> +	if (in_softirq())
>>> +		err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem),
>>> +			       PP_DMA_INDEX_LIMIT, gfp);
>>> +	else
>>> +		err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem),
>>> +				  PP_DMA_INDEX_LIMIT, gfp);
>>
>> Is it an optimisation? bh disable should be reentrant and could
>> just be xa_alloc_bh().
> 
> Yeah, it's an optimisation. We do the same thing in
> page_pool_recycle_in_ring(), so I just kept the same pattern.

Got it. (A rough sketch of this pattern, for reference, is appended at
the end of this mail.)

> 
>> KERN_{NOTICE,INFO} Maybe?
> 
> Erm? Was this supposed to be part of the comment below?

Oops, yes, that's for the warning below.

>>> +	if (err) {
>>> +		WARN_ONCE(1, "couldn't track DMA mapping, please report to netdev@");
>>
>> That can happen with enough memory pressure, I don't think
>> it should be a warning. Maybe some pr_info?
> 
> So my reasoning here was that this code is only called in the alloc
> path, so if we're under memory pressure, the page allocation itself
> should fail before the xarray alloc does. And if it doesn't (i.e., if
> the use of xarray itself causes allocation failures), we really want to
> know about it so we can change things. Hence the loud warning.

There is a gap between the two allocations, though; one succeeding
doesn't guarantee the other will. I'd say the mental test here is
whether user space (including abusive users) can reasonably trigger it,
because setups that panic on warnings do exist, and then the warning
announces itself rather loudly for a condition that could have been
tolerated just fine. Not going to insist, though.

> @maintainers, given the comments above I'm not going to respin for this
> unless you tell me to, but let me know! :)
> 
> -Toke
> 

-- 
Pavel Begunkov
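
For anyone skimming the thread, here is a minimal sketch of the pattern
discussed above (the same one page_pool_recycle_in_ring() follows): use
plain xa_alloc() when BHs are already disabled, and the _bh variant
otherwise. The demo_* names and the XA_LIMIT range are made up purely
for illustration; they are not part of the patch.

#include <linux/gfp.h>
#include <linux/limits.h>
#include <linux/mm_types.h>
#include <linux/preempt.h>
#include <linux/xarray.h>

/* Hypothetical example, not from net/core/page_pool.c. */
static DEFINE_XARRAY_ALLOC1(demo_mapped);

static int demo_track_page(struct page *page, u32 *id, gfp_t gfp)
{
	int err;

	if (in_softirq()) {
		/* BHs are already disabled here, so plain xa_alloc()
		 * avoids a redundant local_bh_disable()/enable() pair.
		 */
		err = xa_alloc(&demo_mapped, id, page,
			       XA_LIMIT(1, U32_MAX), gfp);
	} else {
		/* Process context: the _bh variant keeps a softirq on
		 * this CPU from deadlocking against the xarray lock.
		 */
		err = xa_alloc_bh(&demo_mapped, id, page,
				  XA_LIMIT(1, U32_MAX), gfp);
	}

	return err;
}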