From: Mina Almasry <almasrymina@google.com>
Date: Thu, 12 Sep 2024 07:25:23 -0700
Subject: Re: [RFC 2/2] page_pool: fix IOMMU crash when driver has already unbound
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, liuyonglong@huawei.com, fanghaiqing@huawei.com, Robin Murphy, Alexander Duyck, IOMMU, Wei Fang, Shenwei Wang, Clark Wang, Eric Dumazet, Tony Nguyen, Przemek Kitszel, Alexander Lobakin, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Felix Fietkau, Lorenzo Bianconi, Ryder Lee, Shayne Chen, Sean Wang, Kalle Valo, Matthias Brugger, AngeloGioacchino Del Regno, Andrew Morton, Ilias Apalodimas, imx@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org, linux-wireless@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, linux-mm@kvack.org
In-Reply-To: <20240912124514.2329991-3-linyunsheng@huawei.com>
References: <20240912124514.2329991-1-linyunsheng@huawei.com> <20240912124514.2329991-3-linyunsheng@huawei.com>

On Thu, Sep 12, 2024 at 5:51 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> Networking driver with page_pool
> support may hand over pages still holding a dma mapping to the
> network stack, and try to reuse those pages after the stack is done
> with them and passes them back to the page_pool, to avoid the
> penalty of dma mapping/unmapping. With all the caching in the
> network stack, some pages may be held there without returning to
> the page_pool soon enough, and a VF disable can cause the driver to
> unbind. The page_pool does not stop the driver from doing its
> unbind work; instead it uses a workqueue to periodically check
> whether any pages have come back from the network stack, and if so
> it does the dma unmapping related cleanup work.
>
> As mentioned in [1], attempting DMA unmaps after the driver has
> already unbound may leak resources or at worst corrupt memory.
> Fundamentally, the page pool code cannot allow DMA mappings to
> outlive the driver they belong to.
>
> Currently there seem to be at least two cases where pages are not
> released fast enough, causing the dma unmapping to happen after the
> driver has already unbound:
> 1. ipv4 packet defragmentation timeout: this seems to cause delays
>    of up to 30 secs.
> 2. skb_defer_free_flush(): this may cause infinite delay if there
>    is no trigger for net_rx_action().
>
> In order not to do the dma unmapping after the driver has already
> unbound, and to avoid stalling the unloading of the networking
> driver, add a pool->items array to record all the pages, including
> the ones handed over to the network stack, so the page_pool can do
> the dma unmapping for those pages when page_pool_destroy() is
> called.
>

The approach in this patch is a bit complicated. I wonder if there is
something simpler we can do. From reading the thread, it seems the
issue is that in __page_pool_release_page_dma() we're calling
dma_unmap_page_attrs() on a pool->p.dev that has been deleted via
device_del(), right?

Why not consider pool->p.dev unusable if pool->destroy_cnt > 0? I.e.
in __page_pool_release_page_dma(), we could skip
dma_unmap_page_attrs() if destroy_cnt > 0. More generally, probably
any use of pool->p.dev may be invalid once page_pool_destroy() has
been called. The call sites can be scrubbed for latent bugs.

The hard part is handling the concurrency. I'm not so sure we can fix
this without introducing some synchronization between
page_pool_destroy() seeing the device go away and the code paths
using the device. Are these called from the fast paths? Jesper's
benchmark can tell for sure whether there is any impact on the fast
path.

> Note, the devmem patchset seems to make the bug harder to fix and
> to backport too; this patch does not consider fixing the case for
> devmem yet.

FWIW, from a quick look I did not see anything in this patch that is
extremely hard to port to netmem. AFAICT the issue is that you
skipped changing page_pool to page_pool_items in net_iov. Once that
is done, I think the rest should be straightforward.

--
Thanks,
Mina