Message-ID: <765c84e0-1e4d-4e34-aa46-30a385ca8050@gmail.com>
Date: Mon, 10 Mar 2025 07:17:02 +0000
Subject: Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and unmap them when destroying the pool
To: Mina Almasry, Toke Høiland-Jørgensen, David Wei
Cc: Andrew Morton, Jesper Dangaard Brouer, Ilias Apalodimas, "David S. Miller", Yunsheng Lin, Yonglong Liu, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, linux-mm@kvack.org, netdev@vger.kernel.org
References: <20250308145500.14046-1-toke@redhat.com>
From: Pavel Begunkov

On 3/8/25 19:22, Mina Almasry wrote:
> On Sat, Mar 8, 2025 at 6:55 AM Toke Høiland-Jørgensen wrote:
>>
>> When enabling DMA mapping in page_pool, pages are kept DMA mapped until
>> they are released from the pool, to avoid the overhead of re-mapping the
>> pages every time they are used. This causes problems when a device is
>> torn down, because the page pool can't unmap the pages until they are
>> returned to the pool. This causes resource leaks and/or crashes when
>> there are pages still outstanding while the device is torn down, because
>> page_pool will attempt an unmap of a non-existent DMA device on the
>> subsequent page return.
>>
>> To fix this, implement a simple tracking of outstanding DMA-mapped pages
>> in page pool using an xarray. This was first suggested by Mina[0], and
>> turns out to be fairly straightforward: We simply store pointers to
>> pages directly in the xarray with xa_alloc() when they are first DMA
>> mapped, and remove them from the array on unmap. Then, when a page pool
>> is torn down, it can simply walk the xarray and unmap all pages still
>> present there before returning, which also allows us to get rid of the
>> get/put_device() calls in page_pool.
>> Using xa_cmpxchg(), no additional
>> synchronisation is needed, as a page will only ever be unmapped once.
>>
>> To avoid having to walk the entire xarray on unmap to find the page
>> reference, we stash the ID assigned by xa_alloc() into the page
>> structure itself, in the field previously called '_pp_mapping_pad' in
>> the page_pool struct inside struct page. This field overlaps with the
>> page->mapping pointer, which may turn out to be problematic, so an
>> alternative is probably needed. Sticking the ID into some of the upper
>> bits of page->pp_magic may work as an alternative, but that requires
>> further investigation. Using the 'mapping' field works well enough as
>> a demonstration for this RFC, though.
>>
>> Since all the tracking is performed on DMA map/unmap, no additional code
>> is needed in the fast path, meaning the performance overhead of this
>> tracking is negligible. The extra memory needed to track the pages is
>> neatly encapsulated inside xarray, which uses the 'struct xa_node'
>> structure to track items. This structure is 576 bytes long, with slots
>> for 64 items, meaning that a full node incurs only 9 bytes of overhead
>> per slot it tracks (in practice, it probably won't be this efficient,
>> but in any case it should be an acceptable overhead).

...

>
> Pavel, David, as an aside, I think we need to propagate this fix to
> memory providers as a follow-up. We probably need a new op in the
> provider to unmap. Then, in page_pool_scrub, where this patch does an
> xa_for_each, we need to call that unmap op.

Sounds like it. That is the easy part, since memory providers (mps)
already hold the full list of pages available. We just need to be
careful when unmapping all netmems in the presence of multiple pools,
but that should be fine.

-- 
Pavel Begunkov
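
For illustration only, here is a rough userspace model of the tracking
scheme described in the quoted patch text. It is not the patch's code and
does not use the kernel xarray API: a fixed-size slot table stands in for
the xarray, claiming a free slot stands in for xa_alloc(), and a
compare-and-swap back to NULL stands in for xa_cmpxchg(). All names below
(fake_page, fake_pool, pool_track, pool_untrack_and_unmap, pool_scrub,
MAX_TRACKED) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64  /* hypothetical capacity of this toy table */

/* Stand-in for struct page: keeps the stashed ID and counts unmaps. */
struct fake_page {
    unsigned int dma_id;   /* models the ID stashed in the page */
    int unmap_count;       /* how often this page was "unmapped" */
};

/* Stand-in for the pool's xarray: one atomic pointer per slot. */
struct fake_pool {
    _Atomic(struct fake_page *) slots[MAX_TRACKED];
};

/* Models xa_alloc(): claim a free slot, remember its index in the page. */
bool pool_track(struct fake_pool *pool, struct fake_page *page)
{
    for (unsigned int id = 0; id < MAX_TRACKED; id++) {
        struct fake_page *expected = NULL;

        if (atomic_compare_exchange_strong(&pool->slots[id], &expected, page)) {
            page->dma_id = id;
            return true;
        }
    }
    return false;  /* table full */
}

/*
 * Models the xa_cmpxchg() step: only the caller that swaps the slot
 * from `page` to NULL performs the unmap, so it happens at most once.
 */
bool pool_untrack_and_unmap(struct fake_pool *pool, struct fake_page *page)
{
    struct fake_page *expected = page;

    if (!atomic_compare_exchange_strong(&pool->slots[page->dma_id],
                                        &expected, NULL))
        return false;  /* already unmapped by someone else */

    page->unmap_count++;  /* placeholder for the real DMA unmap */
    return true;
}

/* Models the teardown walk: unmap everything still tracked. */
unsigned int pool_scrub(struct fake_pool *pool)
{
    unsigned int unmapped = 0;

    for (unsigned int id = 0; id < MAX_TRACKED; id++) {
        struct fake_page *page = atomic_load(&pool->slots[id]);

        if (page && pool_untrack_and_unmap(pool, page))
            unmapped++;
    }
    return unmapped;
}
```

In this model, a page that comes back after pool_scrub() has run fails the
compare-and-swap and is simply left alone, which mirrors the "a page will
only ever be unmapped once" property claimed for the xa_cmpxchg() approach.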