From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6603CF649A for ; Sat, 28 Sep 2024 07:35:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3A1EF6B0198; Sat, 28 Sep 2024 03:35:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 351A06B0199; Sat, 28 Sep 2024 03:35:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 219AF6B019A; Sat, 28 Sep 2024 03:35:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 046526B0198 for ; Sat, 28 Sep 2024 03:35:05 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 8063016116D for ; Sat, 28 Sep 2024 07:35:05 +0000 (UTC) X-FDA: 82613335770.11.889AA97 Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) by imf13.hostedemail.com (Postfix) with ESMTP id A238C20004 for ; Sat, 28 Sep 2024 07:35:03 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linaro.org header.s=google header.b=js0n3Hkl; spf=pass (imf13.hostedemail.com: domain of ilias.apalodimas@linaro.org designates 209.85.210.174 as permitted sender) smtp.mailfrom=ilias.apalodimas@linaro.org; dmarc=pass (policy=none) header.from=linaro.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727508840; a=rsa-sha256; cv=none; b=rgrPIxYXK18w/UFAueU+1ENvl4UrdIZb9pBUO2zsjsV51nTGPWWcuGIROLTj6hUiCP1vA5 Grr3FaRlLU53j/8mP1FrckjelfxYC/KjegLraz3sWw5IEfDgtcMjhUtnqjbV9i2hfhxCZv qoC+MaCnWDo3TcuA98cuJYaSPDtlquY= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=linaro.org header.s=google header.b=js0n3Hkl; spf=pass (imf13.hostedemail.com: domain of ilias.apalodimas@linaro.org designates 209.85.210.174 as permitted sender) smtp.mailfrom=ilias.apalodimas@linaro.org; dmarc=pass (policy=none) header.from=linaro.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727508840; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CZ3GJJVCrrzGvR7fAA0nSrjc16wiFeyBC54h9NaWyFQ=; b=ctbCZY9CM3ZMAsb+aFfW9Nbo5re8b4ilRw/havXrHlEaJqNZ+bMFZ0BYSenzex7y3Y8rNu TsioQIMgAXSwqfxOhSHz1+8kikE3cXNXU/vAF5VvN4rjcvwQ8G7rvuFnhZpxRanfzvBtim W38M61UpnYjgEkonFfeRwPZyp30npr4= Received: by mail-pf1-f174.google.com with SMTP id d2e1a72fcca58-7198cb6bb02so2115760b3a.3 for ; Sat, 28 Sep 2024 00:35:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1727508902; x=1728113702; darn=kvack.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=CZ3GJJVCrrzGvR7fAA0nSrjc16wiFeyBC54h9NaWyFQ=; b=js0n3Hkl1J/Rq91OAznimGi8+lzX2QplZwYNH5DUB7dcv9SMkV1meKvCL7USu4OrS/ S49am2bcqDjQl86rZtESib/El99tjB5Cyu9mN/SFmy5/jUTy799bu34Wi3TmUkvb0eFC 36Q2QYp3aSXSqM8WWuY9YdOwOvXw4WiQKrexWe6VmiIwZZJDHLUlm56frWzOIpC5NcWb HV10J86HsStB+8G5jrF3mWjT5YzxdQIbV9vr5mMIqs423aB+8MAe6AFW+9SbAIUGCIEj 6XlpbbRIXm6WcWJETMAAguWXQsTR65d9OP0js9crpAItMAswSL+rtTXOF/MhVnf8VWmq zuMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727508902; x=1728113702; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=CZ3GJJVCrrzGvR7fAA0nSrjc16wiFeyBC54h9NaWyFQ=; b=jisp1VZXSP3oqdlIMQ5DyEESpnDFpxIlr4lYXV8fGfJbfqCiWEFo7PNC+6nNzVO16E RgJleZC48deHISKgkddxeRt2aEMJjz8KW/3eCFK9XUC2xEzEQAiPMXyMEWqypqvMqqnN Lr2R72y1szMMWEKZTte3F+g0K53HRA/Ot3w9HrJCmm7d1uAi5PvjkPJ9M+Zu5beYTnsn 4EBC8pqOX/pM87T42QgzYK87isA6iGep3HakrC7aJsIdUgpJxr8gjxIKhWqojxQNihFA V9aIZcQivfW/KAUXo/NUltwNxQIZJSAz53CefTjxlGwprwWT6qNqEBHn8jbYY+aToNvT C4Fw== X-Forwarded-Encrypted: i=1; AJvYcCV4acwWxb3XtbysKrF7NzjcnS3/cdwMJa+Aegug+Iehz0hkC0dEGfi6ETYuyF+L0OLrS29FUFbz3A==@kvack.org X-Gm-Message-State: AOJu0YwKyE9XB9NKXvfNEhZ/ckJq4yc7O908HCRfzqDaPfcLa1rsdhp0 aXNnsWGB3lombx+GHLH1XZfTRZ9W/p+QXUJUzWFH5slF2AuPvpIoiaaKCxbiFMRd4MXz3+18XpL AG6aXRFXVGPrcaLoiPNjaEH0BGBkJevV5LPVL8A== X-Google-Smtp-Source: AGHT+IE8STUqnUdbp16wn+jWytnLJo7GbATydnL1PXc5q5xA5jCeVCG+33+GkdczeH+HdjoM/t8GaXFW5O+RCkxcmVg= X-Received: by 2002:a05:6a00:8c2:b0:714:2482:ab3c with SMTP id d2e1a72fcca58-71b25f40b4cmr9950991b3a.7.1727508902091; Sat, 28 Sep 2024 00:35:02 -0700 (PDT) MIME-Version: 1.0 References: <20240925075707.3970187-1-linyunsheng@huawei.com> <20240925075707.3970187-3-linyunsheng@huawei.com> <842c8cc6-f716-437a-bc98-70bc26d6fd38@huawei.com> <0ef315df-e8e9-41e8-9ba8-dcb69492c616@huawei.com> <934d601f-be43-4e04-b126-dc86890a4bfa@huawei.com> In-Reply-To: <934d601f-be43-4e04-b126-dc86890a4bfa@huawei.com> From: Ilias Apalodimas Date: Sat, 28 Sep 2024 10:34:25 +0300 Message-ID: Subject: Re: [PATCH net v2 2/2] page_pool: fix IOMMU crash when driver has already unbound To: Yunsheng Lin Cc: Mina Almasry , davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, liuyonglong@huawei.com, fanghaiqing@huawei.com, zhangkun09@huawei.com, Robin Murphy , Alexander Duyck , IOMMU , Wei Fang , Shenwei Wang , Clark Wang , Eric Dumazet , Tony Nguyen , Przemek Kitszel , Alexander Lobakin , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Saeed Mahameed , Leon Romanovsky , Tariq Toukan , Felix Fietkau , Lorenzo Bianconi , Ryder Lee , Shayne Chen , Sean Wang , Kalle Valo , Matthias Brugger , AngeloGioacchino Del Regno , Andrew Morton , imx@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org, linux-wireless@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" X-Stat-Signature: qnzfno57yzehbebkxnm11weyk3eq9s9i X-Rspamd-Queue-Id: A238C20004 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1727508903-808337 X-HE-Meta: U2FsdGVkX1+x9CuvVww/i/XGxbpmtlPiemvBE0oKC2ZdhDnYT/q/YZA2c6lgspjv5J5ZNb5QtNL8HP60vMR3DQMbdvPYDMSUP/2TFeCMc0ITSCRSxvy1HtuN5LyFte09q9DSskpw+TRNeKTnv/2nnIjxdZYoGm6SMu4lIT7Wdk4acasK0Sq+fC6A1f0dntrX6sauXRo9loBPBTh82mLPjUQwj0XKwk5UjIP63BwivLANf49EGDxHP0FnWSLa58EVTPyHn1A116WvCzpPo+CxHuQwn22pYAwuPrC0/Y/W3rAyt+1lYtdI7e1dz9BbaGYT6OqQAuIO75veRMTBwQamJ3fC/mIZpHPwa1ffDBrgqHWn0DclKmoJB6g1vFRXpkAtKW3kFNigpoq2xCA7ZnOH1uL6hVqEo6FeZ6i0/jvGKNRDHQL0ouTalgoORs2YN0zXYFIGessCeTZJTOhzpnbjexliet7kpkwXZ28GJQfqEx08Zwp8iquFtMG0XoyXzUecA6zvEt2K6ObX25WEh/v0tm6lD+dd8RGCF9OvUueNDtjf06zPGUs4cbi5ZUZWOaW6Do9yacgLWZWgHb3aOEwiXX3gok/FxgSdxgGto99mb7NiEZvTic2FpExG2k65qW8/PXdoUQy+7mYtUq0SZawEtb5B7FAK1AQFiZXeF3As3Cc3xlm/bDT5aGrkFgYnRZs5NFL1/jd3NBxeeEMrBE5SKUDNwDw0zzqpVeAXK5eNATT13w2IoZrlofikL+vdjVsOGMCGfDfyliirVlEmXO6ejQR2h2x4FCGYp+YV0F7+M8kxHwMAVAFhfrnM6VH9+ZfErABS3XjcKAns5qP0QNk5m6KaHbCQB9Pfj9mUSBCI/M5/MdycYmZ0puYne25DT22mLr4vqNY/7ftQCqrHpulkjybxWzza6NlT4Tt9i9LYf7LSvCbrYzGYAKcSWc8z5gHuwjIYcrqb0eAFczDgDmd 9L75fU3E fKnwTfXs+OifNPCZjwRp36PzdKgpa7SqI6D83D7z5lH4gzXbfj+/eCgRpCkfs+COAWS9Q06d4L7rG3xr2k3WxikuD3rwQ/rQN9olxZuzzz+HJAWFNPXHdsI31DXmeYQ83P9BdfNIUDIw93gJ5k3keGTa+lu0nFWXk/uOV308zMX4XD9eoHhvw9fznS/n3po39zYe0wzK6SPv/BKyywh60Ge31WPBYq29dpr2HZxCcdaxRqCmB7030I7mUiTcSarZJcehax1NaNqnZkN1UeHdxG2kQDQyHastTLJ8pc/TITARSOfrVj76S/QanQr35d24mMZMBmPRDUfpfmQBFUM9cpGML85FBOcKuVD138qLwOuTD4RePZFgRMx7a2ZIYIEeLithnSLkr15TpiaQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi Yunsheng, Overall this is a patch in the right direction. I want to get feedback from others since Jakub and Jesper seemed to prefer the stalling idea. On Fri, 27 Sept 2024 at 14:29, Yunsheng Lin wrote: > > On 2024/9/27 17:58, Ilias Apalodimas wrote: > > ... > > >> > >>> importantly, though, why does struct page need to know about this? > >>> Can't we have the same information in page pool? > >>> When the driver allocates pages it does via page_pool_dev_alloc_XXXXX > >>> or something similar. Cant we do what you suggest here ? IOW when we > >>> allocate a page we put it in a list, and when that page returns to > >>> page_pool (and it's mapped) we remove it. > >> > >> Yes, that is the basic idea, but the important part is how to do that > >> with less performance impact. > > > > Yes, but do you think that keeping that list of allocated pages in > > struct page_pool will end up being more costly somehow compared to > > struct page? > > I am not sure if I understand your above question here. > I am supposing the question is about what's the cost between using > single/doubly linked list for the inflight pages or using a array > for the inflight pages like this patch does using pool->items? Yes, that wasn't very clear indeed, apologies for any confusion. I was trying to ask on a linked list that only lives in struct page_pool. But I now realize this was a bad idea since the lookup would be way slower. > If I understand question correctly, the single/doubly linked list > is more costly than array as the page_pool case as my understanding. > > For single linked list, it doesn't allow deleting a specific entry but > only support deleting the first entry and all the entries. It does support > lockless operation using llist, but have limitation as below: > https://elixir.bootlin.com/linux/v6.7-rc8/source/include/linux/llist.h#L13 > > For doubly linked list, it needs two pointer to support deleting a specific > entry and it does not support lockless operation. I didn't look at the patch too carefully at first. Looking a bit closer now, the array is indeed better, since the lookup is faster. You just need the stored index in struct page to find the page we need to unmap. Do you remember if we can reduce the atomic pp_ref_count to 32bits? If so we can reuse that space for the index. Looking at it requires a bit more work in netmem, but that's mostly swapping all the atomic64 calls to atomic ones. > > For pool->items, as the alloc side is protected by NAPI context, and the > free side use item->pp_idx to ensure there is only one producer for each > item, which means for each item in pool->items, there is only one consumer > and one producer, which seems much like the case when the page is not > recyclable in __page_pool_put_page, we don't need a lock protection when > calling page_pool_return_page(), the 'struct page' is also one consumer > and one producer as the pool->items[item->pp_idx] does: > https://elixir.bootlin.com/linux/v6.7-rc8/source/net/core/page_pool.c#L645 > > We only need a lock protection when page_pool_destroy() is called to > check if there is inflight page to be unmapped as a consumer, and the > __page_pool_put_page() may also called to unmapped the inflight page as > another consumer, Thanks for the explanation. On the locking side, page_pool_destroy is called once from the driver and then it's either the workqueue for inflight packets or an SKB that got freed and tried to recycle right? But do we still need to do all the unmapping etc from the delayed work? Since the new function will unmap all packets in page_pool_destroy, we can just skip unmapping when the delayed work runs Thanks /Ilias > there is why the 'destroy_lock' is added for protection > when pool->destroy_cnt > 0. > > > > > Thanks > > /Ilias