From: Yunsheng Lin
Date: Tue, 11 Mar 2025 20:25:25 +0800
Subject: Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and unmap them when destroying the pool
To: Matthew Wilcox, Toke Høiland-Jørgensen
CC: Yunsheng Lin, Andrew Morton, Jesper Dangaard Brouer, Ilias Apalodimas, "David S. Miller", Yonglong Liu, Mina Almasry, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
References: <20250308145500.14046-1-toke@redhat.com> <87cyepxn7n.fsf@toke.dk>

On 2025/3/10 23:42, Matthew Wilcox wrote:
> On Mon, Mar 10, 2025 at 10:13:32AM +0100, Toke Høiland-Jørgensen wrote:
>> Yunsheng Lin writes:
>>> Also, Using the more space in 'struct page' for the page_pool seems to
>>> make page_pool more coupled to the mm subsystem, which seems to not
>>> align with the folios work that is trying to decouple non-mm subsystem
>>> from the mm subsystem by avoid other subsystem using more of the 'struct
>>> page' as metadata from the long term point of view.
>>
>> This seems a bit theoretical; any future changes of struct page would
>> have to shuffle things around so we still have the ID available,
>> obviously :)
>
> See https://kernelnewbies.org/MatthewWilcox/Memdescs
> and more immediately
> https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
>
> pagepool is going to be renamed "bump" because it's a bump allocator and
> "pagepool" is a nonsense name. I haven't looked into it in a lot of
> detail yet, but in the not-too-distant future, struct page will look
> like this (from your point of view):
>
> struct page {
> 	unsigned long flags;
> 	unsigned long memdesc;

It seems there may be memory behind the above 'memdesc', with a different
size and layout for each subsystem?

I am not sure I understand the cases where the same page is handled by two
subsystems concurrently, or where a page is allocated in one subsystem and
then passed to another subsystem to be handled. For example, a page_pool
owned page can be mmap'ed into user space through TCP zero copy, see
tcp_zerocopy_vm_insert_batch(). It seems the same page is then handled by
both the networking/page_pool and vm subsystems?

And page->mapping seems to have been moved into 'memdesc', as there is no
'mapping' field in the 'struct page' you list here? Do we need a field
similar to 'mapping' in the 'memdesc' for the page_pool subsystem to
support TCP zero copy?

> 	int _refcount;	// 0 for bump
> 	union {
> 		unsigned long private;
> 		atomic_t _mapcount; // maybe used by bump? not sure
> 	};
> };
>
> 'memdesc' will be a pointer to struct bump with the bottom four bits of
> that pointer indicating that it's a struct bump pointer (and not, say, a
> folio or a slab).

The above seems similar to what I was doing. The difference seems to be
that the memory behind the above pointer is managed by page_pool itself,
instead of the mm subsystem allocating the 'memdesc' memory from a slab
cache?

>
> So if you allocate a multi-page bump, you'll get N of these pages,
> and they'll all point to the same struct bump where you'll maintain
> your actual refcount. And you'll be able to grow struct bump to your
> heart's content. I don't know exactly what struct bump looks like,
> but the core mm will have no requirements on you.
>
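
For illustration only, the tagged 'memdesc' pointer and the shared descriptor
described in the quoted text could look roughly like the userspace sketch
below. Everything in it is an assumption made up for the example: the names
(struct bump_sketch, struct page_sketch, memdesc_encode(), memdesc_to_bump(),
bump_attach()), the tag value 0x3, and the use of aligned_alloc() in place of
whatever allocator would really back the descriptor. It is not kernel code and
not the layout the mm side has settled on; it only shows why a 16-byte aligned
descriptor leaves the bottom four bits free for a type tag, and how N pages
can point at one descriptor that holds the actual refcount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag encoding; the thread only says "bottom four bits". */
#define MEMDESC_TYPE_MASK ((uintptr_t)0xf)
#define MEMDESC_TYPE_BUMP ((uintptr_t)0x3)	/* made-up value */

/* Stand-in for 'struct bump': owned by the pool and free to grow. */
struct bump_sketch {
	_Alignas(16) int refcount;	/* the actual refcount lives here */
	unsigned int nr_pages;
};

/* Cut-down version of the 'struct page' listed in the quoted text. */
struct page_sketch {
	unsigned long flags;
	uintptr_t memdesc;	/* 'unsigned long' in the quoted layout */
	int _refcount;		/* 0 for bump */
};

/* Combine a 16-byte aligned descriptor pointer with a four-bit type tag. */
static uintptr_t memdesc_encode(struct bump_sketch *b, uintptr_t type)
{
	assert(((uintptr_t)b & MEMDESC_TYPE_MASK) == 0);
	return (uintptr_t)b | type;
}

/* Return the descriptor if the tag says "bump", NULL for any other type. */
static struct bump_sketch *memdesc_to_bump(uintptr_t memdesc)
{
	if ((memdesc & MEMDESC_TYPE_MASK) != MEMDESC_TYPE_BUMP)
		return NULL;
	return (struct bump_sketch *)(memdesc & ~MEMDESC_TYPE_MASK);
}

/* Allocate one descriptor and make all 'nr' pages point at it. */
static struct bump_sketch *bump_attach(struct page_sketch *pages,
				       unsigned int nr)
{
	struct bump_sketch *b = aligned_alloc(_Alignof(struct bump_sketch),
					      sizeof(*b));

	if (!b)
		return NULL;

	b->refcount = 1;
	b->nr_pages = nr;
	for (unsigned int i = 0; i < nr; i++) {
		pages[i].flags = 0;
		pages[i].memdesc = memdesc_encode(b, MEMDESC_TYPE_BUMP);
		pages[i]._refcount = 0;
	}
	return b;
}
```

With this shape, any of the N pages resolves to the same descriptor through
memdesc_to_bump(), so a refcount change made through one page is visible
through all of them, and the descriptor can gain fields without touching the
per-page layout.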