From: Mina Almasry <almasrymina@google.com>
Date: Wed, 15 Nov 2023 10:07:11 -0800
Subject: Re: [PATCH RFC 3/8] memory-provider: dmabuf devmem memory provider
To: Yunsheng Lin
Cc: Willem de Bruijn, Jakub Kicinski, davem@davemloft.net, pabeni@redhat.com,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kaiyuan Zhang,
 Jesper Dangaard Brouer, Ilias Apalodimas, Eric Dumazet, Christian König,
 Jason Gunthorpe, Matthew Wilcox, Linux-MM
In-Reply-To: <8b7d25eb-1f10-3e37-8753-92b42da3fb34@huawei.com>

On Wed, Nov 15, 2023 at 1:29 AM Yunsheng Lin wrote:
>
> On 2023/11/14 23:41, Willem de Bruijn wrote:
> >>
> >> I am not sure the dma-buf maintainer's concern is still there with
> >> this patchset.
> >>
> >> Whatever name you call the struct, and however you arrange each field
> >> in it, some metadata is always needed for dmabuf to integrate into
> >> the page pool.
> >>
> >> If the above is true, why not utilize 'struct page' to have more
> >> unified handling?
> >
> > My understanding is that there is a general preference to simplify
> > struct page, and at the least not move in the other direction by
> > overloading the struct in new ways.
>
> To my understanding, the new struct just mirrors the struct that
> page_pool is already using, see:
> https://elixir.free-electrons.com/linux/v6.7-rc1/source/include/linux/mm_types.h#L119
>
> If the struct that page_pool uses gets simplified, I think the new
> struct that the devmem memory provider uses can adjust accordingly.
>
> As a matter of fact, I think the way 'struct page' for devmem is
> decoupled from the mm subsystem may provide a way to simplify, or
> decouple, the already existing uses of 'struct page' in the netstack
> from the mm subsystem. Before this patchset, it seems we have the below
> types of 'struct page':
> 1. page allocated in the netstack using the page pool.
> 2. page allocated in the netstack using the buddy allocator.
> 3. page allocated in another subsystem and passed to the netstack, such
>    as a zerocopy or spliced page?
>
> If we can decouple 'struct page' for devmem from the mm subsystem, we
> may be able to decouple the above uses of 'struct page' from the mm
> subsystem one by one.
>
> > If using struct page for something that is not memory, there is
> > ZONE_DEVICE. But using that correctly is non-trivial:
> >
> > https://lore.kernel.org/all/ZKyZBbKEpmkFkpWV@ziepe.ca/
> >
> > Since all we need is a handle that does not leave the network stack,
> > a network-specific struct like page_pool_iov entirely avoids this
> > issue.
>
> Yes, I agree about the network-specific struct. I am wondering if we
> can make the struct more generic if we want to integrate it into
> page_pool and use it in the net stack.
>
> > RFC v3 seems like a good simplification over RFC v1 in that regard to
> > me. I was also pleasantly surprised how minimal the change to the
> > users of skb_frag_t actually proved to be.
>
> Yes, I agree about that too. Maybe we can make it simpler by using a
> more abstract struct in page_pool, and utilize some of page_pool's
> features too.
>
> For example, from the page_pool doc, page_pool has a fast cache and a
> ptr-ring cache, as below. But if napi_frag_unref() calls
> page_pool_page_put_many() and returns the dmabuf chunk directly to the
> gen_pool in the memory provider, then it seems we are bypassing the
> below caches in the page_pool:
>
>                    +------------------+
>                    |      Driver      |
>                    +------------------+
>                            ^
>                            |
>                            |
>                            v
>    +--------------------------------------------+
>    |               request memory               |
>    +--------------------------------------------+
>        ^                                  ^
>        |                                  |
>        | Pool empty                       | Pool has entries
>        |                                  |
>        v                                  v
>    +-----------------------+     +------------------------+
>    | alloc (and map) pages |     |   get page from cache  |
>    +-----------------------+     +------------------------+
>                                    ^                  ^
>                                    |                  |
>                                    | cache available  | No entries,
>                                    |                  | refill from
>                                    |                  | ptr-ring
>                                    v                  v
>                                +-----------------+  +------------------+
>                                |    Fast cache   |  |  ptr-ring cache  |
>                                +-----------------+  +------------------+

I think you're just misunderstanding the code. The page recycling works
with my patchset. napi_frag_unref() calls napi_pp_put_page() if
recycle == true, and that works the same with devmem as with regular
pages. If recycle == false, we call page_pool_page_put_many(), which
calls put_page() for regular pages and page_pool_iov_put_many() for
devmem pages.
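In condensed form, the dispatch looks roughly like this. This is a
sketch to illustrate the above, not the literal code in the patchset:
the struct layout and the page_is_page_pool_iov() /
page_to_page_pool_iov() helpers follow this RFC discussion from memory
and may differ from the actual patches.

/* page_pool_iov mirrors the page_pool fields of 'struct page' (see the
 * mm_types.h link quoted above), so pool code can treat both uniformly.
 */
struct page_pool_iov {
	unsigned long pp_magic;		/* mirrors struct page */
	struct page_pool *pp;		/* owning pool */
	unsigned long dma_addr;		/* address within the dma-buf chunk */
	atomic_long_t pp_frag_count;	/* mirrors struct page */
	struct dmabuf_genpool_chunk_owner *owner;	/* devmem-specific */
};

static void page_pool_page_put_many(struct page *page, unsigned int count)
{
	if (page_is_page_pool_iov(page)) {
		/* devmem: drop iov references; the chunk returns to the
		 * provider's gen_pool when the pool releases it, not on
		 * every frag unref. */
		page_pool_iov_put_many(page_to_page_pool_iov(page), count);
	} else {
		/* regular page: a plain put_page() per reference. */
		while (count--)
			put_page(page);
	}
}

static void napi_frag_unref(skb_frag_t *frag, bool recycle)
{
	struct page *page = skb_frag_page(frag);

	/* recycle == true: devmem and regular pages alike go back to the
	 * page_pool, so the fast cache and ptr-ring above are still used. */
	if (recycle && napi_pp_put_page(page, false))
		return;

	/* recycle == false: drop the reference directly. */
	page_pool_page_put_many(page, 1);
}

The branch is taken per page at unref time, so devmem frags flow
through the same recycling entry points as regular pages.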
So the memory recycling works exactly the same with devmem as with
regular pages. In my tests I do see the devmem being recycled
correctly. We are not bypassing any caches.

-- 
Thanks,
Mina