From: Toke Høiland-Jørgensen
To: Mina Almasry, Pavel Begunkov, David Wei
Cc: Andrew Morton, Jesper Dangaard Brouer, Ilias Apalodimas,
    "David S. Miller", Yunsheng Lin, Yonglong Liu, Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Simon Horman, linux-mm@kvack.org,
    netdev@vger.kernel.org
Subject: Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and unmap them when destroying the pool
References: <20250308145500.14046-1-toke@redhat.com>
Date: Sun, 09 Mar 2025 13:42:30 +0100
Message-ID: <87cyeqml3d.fsf@toke.dk>

Mina Almasry writes:

> On Sat, Mar 8, 2025 at 6:55 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> When enabling DMA mapping in page_pool, pages are kept DMA mapped until
>> they are released from the pool, to avoid the overhead of re-mapping the pages
>> every time they are used. This causes problems when a device is
>> torn down, because the page pool can't unmap the pages until they are
>> returned to the pool. This causes resource leaks and/or crashes when
>> there are pages still outstanding while the device is torn down, because
>> page_pool will attempt an unmap of a non-existent DMA device on the
>> subsequent page return.
>>
>> To fix this, implement a simple tracking of outstanding dma-mapped pages
>> in page pool using an xarray. This was first suggested by Mina[0], and
>> turns out to be fairly straight forward: We simply store pointers to
>> pages directly in the xarray with xa_alloc() when they are first DMA
>> mapped, and remove them from the array on unmap. Then, when a page pool
>> is torn down, it can simply walk the xarray and unmap all pages still
>> present there before returning, which also allows us to get rid of the
>> get/put_device() calls in page_pool.
>
> THANK YOU!! I had been looking at the other proposals to fix this here
> and there and I had similar feelings to you. They add lots of code
> changes and the code changes themselves were hard for me to
> understand. I hope we can make this simpler approach work.

You're welcome :) And yeah, me too!

>> Using xa_cmpxchg(), no additional
>> synchronisation is needed, as a page will only ever be unmapped once.
>>
>
> Very clever. I had been wondering how to handle the concurrency. I
> also think this works.

Thanks!

>> To avoid having to walk the entire xarray on unmap to find the page
>> reference, we stash the ID assigned by xa_alloc() into the page
>> structure itself, in the field previously called '_pp_mapping_pad' in
>> the page_pool struct inside struct page. This field overlaps with the
>> page->mapping pointer, which may turn out to be problematic, so an
>> alternative is probably needed. Sticking the ID into some of the upper
>> bits of page->pp_magic may work as an alternative, but that requires
>> further investigation.
>> Using the 'mapping' field works well enough as
>> a demonstration for this RFC, though.
>>
>
> I'm unsure about this. I think page->mapping may be used when we map
> the page to the userspace in TCP zerocopy, but I'm really not sure.
> Yes, finding somewhere else to put the id would be ideal. Do we really
> need a full unsigned long for the pp_magic?

No, pp_magic was also my backup plan (see the other thread). Tried
actually doing that now, and while there's a bit of complication due to
the varying definitions of POISON_POINTER_DELTA across architectures, it
seems that this can be defined at compile time. I'll send a v2 RFC with
this change.

-Toke
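
For readers following the thread, the tracking scheme described in the
quoted patch text (store the page under an allocated ID, stash that ID in
the page, and let a compare-and-exchange decide who performs the single
unmap) can be sketched in userspace C roughly as follows. This is only an
analogy under stated assumptions: it is not the code from the RFC, the
fixed-size slot table and C11 atomics merely stand in for the kernel's
xarray with xa_alloc() and xa_cmpxchg(), and every name in it (fake_page,
fake_pool, pool_track, pool_release, pool_destroy, POOL_SLOTS) is
hypothetical.

```c
/* Userspace sketch only; hypothetical names, not the RFC's kernel code. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 64		/* arbitrary capacity chosen for the sketch */

struct fake_page {
	uint32_t dma_id;	/* slot ID stashed in the page itself */
	int unmap_count;	/* how many times the "DMA unmap" ran */
};

struct fake_pool {
	/* Stand-in for the xarray: one atomic pointer per ID. */
	_Atomic(struct fake_page *) slots[POOL_SLOTS];
};

/*
 * Track a freshly "mapped" page: claim the first free slot and record
 * its ID in the page. Returns 0 on success, -1 if the table is full.
 */
int pool_track(struct fake_pool *pool, struct fake_page *page)
{
	for (uint32_t id = 0; id < POOL_SLOTS; id++) {
		struct fake_page *expected = NULL;

		page->dma_id = id;	/* set before the page is published */
		if (atomic_compare_exchange_strong(&pool->slots[id],
						   &expected, page))
			return 0;
	}
	return -1;
}

/*
 * Unmap at most once: only the caller whose compare-and-exchange swaps
 * the page out of its slot does the unmap. Returns 1 if this call did
 * the unmap, 0 if the page was no longer tracked.
 */
int pool_release(struct fake_pool *pool, struct fake_page *page)
{
	struct fake_page *expected = page;

	assert(page->dma_id < POOL_SLOTS);
	if (!atomic_compare_exchange_strong(&pool->slots[page->dma_id],
					    &expected,
					    (struct fake_page *)NULL))
		return 0;

	page->unmap_count++;	/* stand-in for the real DMA unmap */
	return 1;
}

/*
 * Teardown: walk every slot and unmap whatever is still outstanding.
 * Returns the number of pages unmapped by this walk.
 */
int pool_destroy(struct fake_pool *pool)
{
	int unmapped = 0;

	for (uint32_t id = 0; id < POOL_SLOTS; id++) {
		struct fake_page *page = atomic_load(&pool->slots[id]);

		if (page)
			unmapped += pool_release(pool, page);
	}
	return unmapped;
}
```

In this sketch a late page return after teardown simply finds its slot
empty, so pool_release() returns 0 and never touches the device again,
which is the property the compare-and-exchange is there to provide. Where
the ID actually lives in struct page (the 'mapping' overlap versus the
upper bits of pp_magic) is deliberately left out, since that is the open
question in this thread.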