From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1B10C28B2E for ; Tue, 11 Mar 2025 13:24:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2DF3A28000E; Tue, 11 Mar 2025 09:24:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 28EC4280001; Tue, 11 Mar 2025 09:24:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1323028000E; Tue, 11 Mar 2025 09:24:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id E8156280001 for ; Tue, 11 Mar 2025 09:24:20 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 9A26C58191 for ; Tue, 11 Mar 2025 13:24:21 +0000 (UTC) X-FDA: 83209339122.14.19FA44B Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) by imf10.hostedemail.com (Postfix) with ESMTP id 5F20EC0019 for ; Tue, 11 Mar 2025 13:24:19 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=kiFb78LW; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf10.hostedemail.com: domain of asml.silence@gmail.com designates 209.85.221.48 as permitted sender) smtp.mailfrom=asml.silence@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1741699459; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=SB2j+2DJD1bu8gPq0cnVv7W46llV7q01ONp8A9vhfUg=; b=vQEn77Kj16yj0o7QSP0gp0x1tvCbsk2u77BNXcWa0b0qXrYiEVY0IYHSRPDQ9wCAPG9q7a ntR84zyZysI6/1DD0ZZCKRSx+nFTd/Kg09RKrE7Xxdu45nyqYlm2/bEON1hzf+tEs9hDYm zpj9XQoB/jGOHz7tpZjda9TJPBLa9hg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1741699459; a=rsa-sha256; cv=none; b=Z/g5glw7xkuLr5yLcJ5giZSZEg6ZpSgipPBWC5//OdC6xuZnzfTQzCmWYv5MIoBaloMDMG 5U3HiYNPoRFzIxoel9dRXlxP0E2W2mY+kLbCdZ/79kL5Vqv34CxxqTSfpNi4yOVyOkG6AP nRbWQvPxN3UleN6wCYlcQak1+tPiKXA= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=kiFb78LW; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf10.hostedemail.com: domain of asml.silence@gmail.com designates 209.85.221.48 as permitted sender) smtp.mailfrom=asml.silence@gmail.com Received: by mail-wr1-f48.google.com with SMTP id ffacd0b85a97d-39133f709f5so2110185f8f.0 for ; Tue, 11 Mar 2025 06:24:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741699458; x=1742304258; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=SB2j+2DJD1bu8gPq0cnVv7W46llV7q01ONp8A9vhfUg=; b=kiFb78LWoMY1OR8IDoPUHCVQmkSvZdnu1Pu09DJB9twaSg2P1LsJxaoVWg8A8iEqck MvPLOGC/qXwV66eNfE6Wt7RUq/kf8m0yG11M4KCGGAgiI4dRD75zWBVVSxetcKLjbfGm km3A0cKsGd6cQWOl7WW2U+tiFlVlZ4wmQOPlyisBTNDg1RpPNFuW65L4auwDeztLDwSc pUBSlHoCsB3jH3cpnHD9Kzj4Tnh2QxX+J8RhzFXvTnETxxA8QHQkeG8EmkLeyaInqh2L ljOKfhGsLHxQG+dY/ymZaJ7a2mHihXeF8QnSuuurs8U28cM2fdVT9WD08vCcdwwsEG8/ SSLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741699458; x=1742304258; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SB2j+2DJD1bu8gPq0cnVv7W46llV7q01ONp8A9vhfUg=; b=PlBWqkr4G/kL+r/WY76WTN6HKinaAS2aKgISK+brv6K/F3j8w1zCRJTpGz/ZGsznZy y9Z1azqCxyXJvd8hLT6Q7VqEvJuld7dA7XXqnPBFOcI8hFpk4kIojhaXvXNXsq6ldtw1 TeRxF0Xs4aV4jDWV/DRC7n0/ac3igyiBX3zUxMCU2+t8ACQJQlGgTTTSZbAHgSadaQda muZYtuXR6aOljrFD/a7hGThHR2UvV8AXAy6uhWT5Y1AwLB+2ERgOoNsUTsLvHGl7Ku3H TE/dmkE7QFLpEmLgfoXkIWx58aLv1Ixe6MCmuRn5GUYVtmcO7OfJ939m8NNqhVTzEAsM SrIw== X-Forwarded-Encrypted: i=1; AJvYcCU26XbXIIYfvrzIn7hytXZTAmAfn++rR+sVcf0eBxnz8tF+ga4VsD+NbGjKJUNEmNcgt77RN+MOpg==@kvack.org X-Gm-Message-State: AOJu0YwDCR1ZTSyOvzKKC7jep6sQTqXx+tcvOYxiv2Y/923TPfDzVin4 j9hieZmjz5yLdzXyeWzchikYrW56vbzWcbPlyaRGPcwyPw+AxwhV X-Gm-Gg: ASbGncuqyq+Anq1SHGzAtRxy/0HmF0DiRnfM8P2DKtg3SyA4J30R5vvGVh0jp1R6iMb ef5PoVoANd5G+CH9z7bAvXITOWdA/TdkuWkjSWN+Swux3UvXZeQbF/tLWsmtDdd6r5bdRujtRh5 ZQpWJzDPtevGelrLddLHd3leTuFLu5LjjkKO6vlCmDNx1PaH1luMWThlf9cs4KgimoVuLWI4PIm I9ynizU/yP6cM6bVpKOQdr/+lGK/rLuFDXCRanNGAyBatDfOzulzdLY7dQLfE16DyCWisYDRaQp xgvDEiN4OmyZ9mgeq1P2p7NeiIjBCDZRvmx2NU70EtyJhjSGCXlX8l7Phw== X-Google-Smtp-Source: AGHT+IEG41Sh2atWO3RHBfoSb7Tx3IIwYhLEiEKsNbi+A9VfcSbMXi1f7yke56YANW9xiS3CuXBj5g== X-Received: by 2002:a05:6000:381:b0:391:4ca:490 with SMTP id ffacd0b85a97d-39132d76e70mr12811495f8f.29.1741699457734; Tue, 11 Mar 2025 06:24:17 -0700 (PDT) Received: from [192.168.116.141] ([148.252.129.108]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3912c0e2f10sm18253881f8f.65.2025.03.11.06.24.15 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 11 Mar 2025 06:24:16 -0700 (PDT) Message-ID: Date: Tue, 11 Mar 2025 13:25:11 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and unmap them when destroying the pool To: =?UTF-8?Q?Toke_H=C3=B8iland-J=C3=B8rgensen?= , Mina Almasry , David Wei Cc: Andrew Morton , Jesper Dangaard Brouer , Ilias Apalodimas , "David S. Miller" , Yunsheng Lin , Yonglong Liu , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , linux-mm@kvack.org, netdev@vger.kernel.org References: <20250308145500.14046-1-toke@redhat.com> <87cyeqml3d.fsf@toke.dk> Content-Language: en-US From: Pavel Begunkov In-Reply-To: <87cyeqml3d.fsf@toke.dk> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspam-User: X-Rspamd-Queue-Id: 5F20EC0019 X-Rspamd-Server: rspam08 X-Stat-Signature: z4k3dy6buj5r6giki4jyro4zub9pfiac X-HE-Tag: 1741699459-764767 X-HE-Meta: U2FsdGVkX180q1jdkvfxehhrBbUHbYZ9BBMzj1IeQS4Clybv3shaQjtNJeMsc43gBUHyZ4UKsC7tquxv8AQ3q4blk0vOBdsbc7Y9AYGgym0HNMegXEIvkIp70uP+VkNngdIz/96ZAthG2zWd9zybN1vmRQIsJFE95hsHjDd+Ron8JmJqLmqBXuXNHrONZFWeekP5GCY45dl6tDBwGnS/dJrHFNoTcW14Cb5lttLVlh6YPb0Jq3pBTeUD4PZ8WjwaMxW8pICF7kyFu+IhnIutggOCbayEOqwrJjnnWZFbe/aRRnI03lIfpkvLoAiDZxjlLWNKNkMZgwGgjPs1c0OKlqzrWA8A0ZGcxqLpGE4TpM/MzFedKmPPmodvmGZFKIgEV0gbJmjWuJ3ynAKeYDZzp3NLkJR+MCVufbAl2/fhS2+gSrdXYUAK2sckhX6BHbUAv1VEK933maRx4FkViUXou1smgVUqP4QtZwJJiyXYDGEk9ZdirexAOXj+r6xuznsWGHKeThfDgJxJD3ao6mktZV79JG9nifvul79pwCLsOE3cphQL6Fw9F5knxWi8su7gX6E+JkquqWaFajkRn00O452AuE1UifHo6TiC13uiQfMJ6f+QTW4Z+JC7l08TqmZagKnONKH1UNe2VYM1P4eEwWhNhy02+wg+obaVb/xg6pX2vQ0V53cQgo8fEF9Sx4qjJrojpuwSl+EfNM3LbA0KD5HsnTHBkGZe9zvLh6cq3QzOC96+eZDykLDKeiz4fLXLS+kO6OlFOKzsnksX0NQwsKOtf5ic5Mp3Oq4yQhKP78NHGSh4CbtXGYG5kzIDOXHvPsEmOhq+hNCkYqOjzXj773DonnOsgyFZ3VFuSKvIzdZHnae050p3jELd6oQGyB/WTEoxqeqwkPgEtB9e320rdO1r6f5aIqAgUOtxXXj0E7Ezz29bSqmIOHcO0tmzo4gJihZm/LHitaoc6gqTkHQ xH7oMAqz 0ZQJAIaOwfb5OrQYrvqxMPpmG/AXhTX4+++4hXGiI1qf6g1cu1I0dOCpqpYQjJhd8LUZz8HwKFtW0KCQoiiEMrUligSFvCxJKSea98GbppUcyN8bg7bic8d7OaSIuw2FanAdyz/7mb/tyo57QmoFRJkbfXQ89xKVxae+qfs9DN0caifcBNW08qFCuoogyRHRJcdXeHarhqjs5L6+w4agIHGDrPq8F7Ywna7ycwMjSSV9j2Una0BlamxBxwmKM4OGTFU6ynGr5vndwGP2mUzFudcz5p17oZY2eQZBe7z7UUzulpeQQkWDipWJGEUpzEMRIlR3ncVakkFl3N8n7vaIn+PJNvIxRZG5uFSrs3cxssz0kDQYkg4mzvYSqoy6T5rMAECzuirH/Xakle/1sJTMVVMvAzYD1cJuuIwkxVN2lG1zO7uDP+uAWe9wzXNis06xa6B4RfXzPWW4jA+cYLvEeind9hmPhQpYJHv7uF3axhqyTzf9AFuG1Th03hDDAD4Qb7VAkCjqx3GgGCAGdjJ9QjqyHLhnqwWu8zEf43XBXQqK6X+E= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 3/9/25 12:42, Toke Høiland-Jørgensen wrote: > Mina Almasry writes: > >> On Sat, Mar 8, 2025 at 6:55 AM Toke Høiland-Jørgensen wrote: >>> >>> When enabling DMA mapping in page_pool, pages are kept DMA mapped until >>> they are released from the pool, to avoid the overhead of re-mapping the >>> pages every time they are used. This causes problems when a device is >>> torn down, because the page pool can't unmap the pages until they are >>> returned to the pool. This causes resource leaks and/or crashes when >>> there are pages still outstanding while the device is torn down, because >>> page_pool will attempt an unmap of a non-existent DMA device on the >>> subsequent page return. >>> >>> To fix this, implement a simple tracking of outstanding dma-mapped pages >>> in page pool using an xarray. This was first suggested by Mina[0], and >>> turns out to be fairly straight forward: We simply store pointers to >>> pages directly in the xarray with xa_alloc() when they are first DMA >>> mapped, and remove them from the array on unmap. Then, when a page pool >>> is torn down, it can simply walk the xarray and unmap all pages still >>> present there before returning, which also allows us to get rid of the >>> get/put_device() calls in page_pool. >> >> THANK YOU!! I had been looking at the other proposals to fix this here >> and there and I had similar feelings to you. They add lots of code >> changes and the code changes themselves were hard for me to >> understand. I hope we can make this simpler approach work. > > You're welcome :) > And yeah, me too! > >>> Using xa_cmpxchg(), no additional >>> synchronisation is needed, as a page will only ever be unmapped once. >>> >> >> Very clever. I had been wondering how to handle the concurrency. I >> also think this works. > > Thanks! > >>> To avoid having to walk the entire xarray on unmap to find the page >>> reference, we stash the ID assigned by xa_alloc() into the page >>> structure itself, in the field previously called '_pp_mapping_pad' in >>> the page_pool struct inside struct page. This field overlaps with the >>> page->mapping pointer, which may turn out to be problematic, so an >>> alternative is probably needed. Sticking the ID into some of the upper >>> bits of page->pp_magic may work as an alternative, but that requires >>> further investigation. Using the 'mapping' field works well enough as >>> a demonstration for this RFC, though. >>> >> >> I'm unsure about this. I think page->mapping may be used when we map >> the page to the userspace in TCP zerocopy, but I'm really not sure. >> Yes, finding somewhere else to put the id would be ideal. Do we really >> need a full unsigned long for the pp_magic? > > No, pp_magic was also my backup plan (see the other thread). Tried > actually doing that now, and while there's a bit of complication due to > the varying definitions of POISON_POINTER_DELTA across architectures, > but it seems that this can be defined at compile time. I'll send a v2 > RFC with this change. FWIW, personally I like this one much more than an extra indirection to pp. If we're out of space in the page, why can't we use struct page * as indices into the xarray? Ala struct page *p = ...; xa_store(xarray, index=(unsigned long)p, p); Indices wouldn't be nicely packed, but it's still a map. Is there a problem with that I didn't consider? -- Pavel Begunkov